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Abstract 

Parametrized families of PDEs arise in various contexts such as inverse problems, 
control and optimization, risk assessment, and uncertainty quantification. In most of 
these applications, the number of parameters is large or perhaps even infinite. Thus, 
the development of numerical methods for these parametric problems is faced with 
the possible curse of dimensionality. This article is directed at (i) identifying and 
understanding which properties of parametric equations allow one to avoid this curse 
and (ii) developing and analyzing effective numerical methodd which fully exploit these 
properties and, in turn, are immune to the growth in dimensionality. 

The first part of this article studies the smoothness and approximability of the 
solution map, that is, the map a i—)• u{a) where a is the parameter value and u{a) is 
the corresponding solution to the PDE. It is shown that for many relevant paramet¬ 
ric PDEs, the parametric smoothness of this map is typically holomorphic and also 
highly anisotropic in that the relevant parameters are of widely varying importance 
in describing the solution. These two properties are then exploited to establish con¬ 
vergence rates of n-term approximations to the solution map for which each term is 
separable in the parametric and physical variables. These results reveal that, at least 
on a theoretical level, the solution map can be well approximated by discretizations of 
moderate complexity, thereby showing how the curse of dimensionality is broken. This 
theoretical analysis is carried out through concepts of approximation theory such as 
best n-term approximation, sparsity, and n-widths. These notions determine a priori 
the best possible performance of numerical methods and thus serve as a benchmark for 
concrete algorithms. 

The second part of this article turns to the development of numerical algorithms 
based on the theoretically established sparse separable approximations. The numerical 
methods studied fall into two general categories. The first uses polynomial expansions 
in terms of the parameters to approximate the solution map. The second one searches 
for suitable low dimensional spaces for simultaneously approximating all members of 
the parametric family. The numerical implementation of these approaches is carried 
out through adaptive and greedy algorithms. An a priori analysis of the performance 
of these algorithms establishes how well they meet the theoretical benchmarks. 

*This research was supported by the ONR contracts N00014-11-1-0712 and N00014-12-1-0561, the NSF 
grant DMS 1222715, the Institut Universitaire de France and the ERG advanced grant BREAD. 
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1 Overview 


1.1 Parametric and stochastic PDEs 

Partial differential equations (PDEs) are commonly used to model complex systems in a va¬ 
riety of physical contexts. When solving a given PDE, one typically Exes certain parameters: 
the shape of the physical domain, the diffusion or velocity held, the source term, the hux or 
reaction law, etc. We use the terminology parametric PDEs when some of these parameters 
are allowed to vary over a certain range of interest. When treating such parametric PDEs, 
one is interested in Ending the solution for all parameters in the range of interest. 

To describe such problems in their full generality, we adopt the formulation 

P(M,a) = 0, (1.1) 

where a denotes the parameters, u is the unknown of the problem, and 

r-.VxX^W, (1.2) 

is a linear or nonlinear partial difierential operator, with {X, V,W) a triplet of Banach spaces. 
We assume that the parameter a ranges over a compact set A <Z X, and that for any a in 
this range there exists a unique solution u = u{a) G V to fll.ip . This allows us to deEne the 
solution map 

u : u{a), (1.3) 

which acts from X onto V and is well deEned over A. We also deEne the solution manifold 
as the family 

A4 = u(A) = {u(a) : a G .4,}, (1.4) 

which gathers together all solutions as the parameter varies within its range. 

One simple guiding example, which will be often used throughout this article, is the linear 
elliptic equation 

— div(aVM) = /, on D, 

M = 0 on dD, (1.5) 

set on a Lipschitz domain D C M'”. Here, we Ex the right side f as a real valued function and 
consider real valued difiusion coefEcients a as the parameter. The corresponding operator V 
is therefore given by 

V{u,a) = f + div{aVu). (1.6) 

A possible choice for the triplet of spaces is then 

(X, V, W) = {L^{D), Hl{D), H-\D)). (1.7) 

Indeed, if m G E, a G X and / G W, one then deEnes V{u,a) as an element of W by 
requiring that 

{P{u,a),v) = {f,v) + JaXu-Xv, n G E, (1.8) 

D 
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where (•, •) is the duality bracket between V' = W and V. Lax-Milgram theory ensures the 
existence and uniqueness of a solution u(a) to fll.ll) from V, if for some r > 0, the diffusion 
a satishes the ellipticity condition 


a(x) > r, X e D. (1.9) 

Therefore, a typical parameter range is a set ^ C {a G L°°{D) : a > r}, which in addition 
is assumed to be compact in 

Although elementary, the above example gathers important features that are present in 
other relevant examples of parametric PDEs. In particular, the solution map a ^ u{a) 
acts from an inhnite dimensional space into another inhnite dimensional space. Also note 
that, while the operator V of fll.6l) is linear both in a and u (up to the constant additive 
term /) the solution map is nonlinear. Because of the high dimensionality of the parameter 
space X, such problems represent a significant challenge when trying to capture this map 
numerically. One objective of this article is to understand which properties of this map allow 
for a successful numerical treatment. Concepts such as holomorphy, sparsity, and adaptivity 
are at the heart of our development. 

The solution map may also be viewed as a function 

(x, a) I—)■ n(x, a) (1-10) 

of both the physical variable x E D and the parametric variable a E A. The parametric 
variable has a particular status because its different instances are uncoupled: for any fixed 
instance a = Oq, we may solve the PDE exactly or approximately and therefore compute 
u{x, Oq) for all values of x, while ignoring all other values a ^ Qq. This plays an important role 
for certain numerical methods which are based on solving the parametric PDE for different 
particular values of a, since this task can be parallelized. 

Parametric PDEs occur in a variety of modeling contexts. We draw the following major 
distinctions in their setting: 

• Deterministic modeling: the parameters are deterministic design or control vari¬ 
ables which may be tuned by the user so that the solution u, or some quantity of 
interest Q{u), has prescribed properties. For instance, if the elliptic equation fll.Sp is 
used to model the heat conduction in a component produced by an industrial process, 
one may want to design the material in order to minimize the heat flux on a certain 
part on the boundary P C dD, in which case the quantity of interest to be optimized 
is 

Q{u) = j aVu ■ n, (1-11) 

r 

where n is the outer normal. This amounts to optimizing the function 

a i-G- F{a) := Q{u{a)). (1-12) 


over A. 
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• Stochastic modeling: the parameters are random variables with prescribed proba¬ 
bility laws, which account for uncertainties in the model. Therefore a has a certain 
probability distribution p supported on A. One is then typically interested in the re¬ 
sulting probabilistic properties of the solution u, which is itself a random variable over 
A with values in V, or in the probabilistic properties of a quantity of interest Q{u). For 
instance, if the elliptic equation fll.Sp is used to model oil or ground water diffusion, 
a common way to deal with the uncertainty of the underground porous media is to 
dehne a as a random held with some prescribed law. Then one might want to estimate 
the mean solution 

u := E(m), (1-13) 

or its variance 

Viu) :=E(||m-F||2^), (1.14) 

or the average hux through a certain interface T, that is, E(Q(m)) with Q{u) as in 
fll.lip . or the probability that this hux exceeds a certain quantity, etc. 

In both the deterministic and stochastic settings, the given application may require eval¬ 
uating u{a^) for a large number n 3> 1 of instances {a^,..., a”} of the parameter a. This is 
the case, for instance, when using a descent method for optimizing a quantity of interest in 
the deterministic framework, or a Monte-Carlo method for evaluating an expectation in the 
stochastic framework. Since each individual instance u{a) is the solution of a PDF, its exact 
evaluation is typically out of reach. Instead, each query for u{a^) is approximately evaluated 
by a numerical solver, which may itself be computationally intensive for high accuracy. 

In order to signihcantly reduce the number of computations needed for attaining a pre¬ 
scribed accuracy, alternate strategies, commonly referred to as reduced modeling, have been 
developed. Understanding which of these strategies are effective, in the case where the 
parameter has large or inhnite dimension, and why, is the subject of this article. 

1.2 AfRne representation of the parameters 

So far our description of the set A of parameters allows it to be any compact subset of X. 
An important ingredient in both our theoretical and numerical developments is to identify 
any a G M through a sequence of real numbers. We are especially interested in affine 
representations of A. We say that a sequence of functions ^lJj G X is an affine 

representer for A, or representer for short, if we can write each a as 

a = a{y) = a + ^yj'iljj, y ■= {yj)j>i, yj=yj{a), (1.15) 

i>i 

where the yj are real numbers, a is a fixed function from X, and the series converges in the 
norm of X for each a G A. We are making a slight abuse of notation here since we use a to 
represent a general element of A and also use a to represent the map 

a:ye^a{y), (1.16) 
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from to X. But the meaning will always be clear from the context. 

It is easy to see that for any compact set ^ in a Banach space X affine representers exist. 
For example, if X has a Schauder basis then any such basis will be a representer. Even if 
X does not admit a Schauder basis, as is the case for our example X = L°°{D), we can still 
hnd representers as follows. Choose any a E A. Since /C ^ — a is compact there exist 
hnite dimensional spaces (X„)„>o, with dim(X„) = n, such that 

dist(/C, := sup min ||a — 6||x —^ 0 as n —)■ cxo. (1-17) 

ae/c bGXn 

We can also take the spaces X^ to be nested, that is 

Xn C Xn+1, n > 0. (1.18) 

Let {4>j,n)j=i,...,n be any basis for Xn and dehne Nn ■= n{n — l)/2. The sequence 

ipj := n, j = Nn,..., Nn+i, (1.19) 

contains each of the bases for all n > 1. Given any a E A, let an be a best 

approximation in X to a — a from X„,with Oq := 0. Then, we can write 

OO 

a a ~\~ ^ ^ 

n=l 

Each term a„ — a„_i is in and hence can be written as a linear combination of the ipj. 
Therefore, is a representer for A. 

Affine representations fll.lSp often occur in the natural formulation of the parametric 
problem. For instance, if the diffusion coefficient a in fll.Sp is piecewise constant over a hxed 
partition of the physical domain D, then it is natural to set 

d 

a{y) = a + ^yjXD^, ( 1 - 21 ) 

i=i 

where a is a constant and the Xdj are the characteristic functions of the subdomains Dj. 
Similarly, if the parameter a describes the shape of the boundary of the physical domain in 
a computer-aided design setting, a typical format is 

d 

a{y) = a + '^yjBj, (1.22) 

where a represents a nominal shape and the Bj are B-spline functions associated to control 
points. In these two examples, d is hnite, yet possibly very large. 

In the statistical context, if a is a second order random held over a domain D, a frequently 
used choice in fll.lSp is d := E(a), the average held, and {'ipj)j>o, the Karhunen-Loeve basis, 
that is, the eigenfunctions of the covariance operator 

V ^ RaV := J Ca{-,x)v{x)dx, Ca{z,x) :=E,((a{z) —d{z)){a{x) —d{x)). (L23) 

D 
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Then, the resulting scalar variables are centered and uncorrelated, that is, ]E(|/j) = 0 and 
= 0 when ij^j. 

Even if an affine representation of the form fll.151) is not given in the formulation of the 
problem, one can be derived by taking any representation system in the Banach space 

X. For example, if X admits a Schauder basis, then one can take any such basis {'ipj)j>i 
for X and arrive at such an expansion. In classical spaces X, such as or Sobolev spaces, 
standard systems of approximation, such as Fourier series, splines, or wavelets can be used. 

The advantage of the representation fll.151) is that a can now be identified through the 
sequence {yj)j>i- When considering all a G we obtain a family of such sequences. Note 
that this family can be quite complicated. In order to simplify matters, we normalize the 
ijjj, so that for any j, 

sup |%(a)| = 1. (1.24) 

Such a renormalization is usually possible because A is compact and yj{a) depends continu¬ 
ously on a. After this normalization, for each a E A, the sequence (|/j(a))j>i belongs to the 
inhnite dimensional cube 

f/:=[-l,l]^. (1.25) 

Notice that taking a general sequence {yj)j>i from this cube, there may not be an a G ^ 
with yj = yj{a), j > 1. Also, if {'ipj}j>i is not a basis , the representation fll.151) may not be 
unique. We define 

Ua ■= {{yj)j>i e u : ^ 

j>i 

We are mainly interested in representers a, (t/^j)j>i for which 

u + (1-27) 

j>i 

converges in X for each (yj) G U. We call such representers complete. In this case, we may 
define 

a{U) := {a = a{y) = a + ^yj'i/jj : {yj)j>iEU}, (1.28) 

i>i 

so that 

Aca{U). (1.29) 

A typical case of a complete representer is when (||'0i||x) is a sequence in f'^(N). 

Once an affine representation has been chosen, the initial solution map a ^ u{a) becomes 
equivalent to the map y i-G- u{a{y)) which is dehned on U^. With an abuse of notation, we 
write this new solution map as 


y ^ u{y) := u{a{y)). (1.30) 

This is a Banach space valued function of an infinite number of variables. Note that in the 
case where the affine representation has a finite number d of terms, the range of ?/ is [—1,1]'^. 
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However, the infinite dimensional case subsumes the finite dimensional case, since the latter 
may be viewed as a particular case with ijjj = 0 for j > d. 

In the case of a complete representer, a{y) is defined on all of U. However, we do not 
know whether the solution map u is defined on all of U. To guarantee this, the following 
assumption will be used often. 

Assumption A: The parameter set A has a complete representer {'ipj)j>i and the solu¬ 
tion map a i—)■ u{a) is well defined on the whole set a{U), or equivalently the solution map 
y u{y) is well defined on the whole set U. 

This assumption naturally holds when the set A is exactly defined as a{U). 

1.3 Smoothness of the solution map 

One objective of this article is to develop efficient numerical approximations to the solution 
maps of fll.Sp or fll.SOp . One of the main difficulties is that these maps are high or infinite 
dimensional, in the sense that the dimension of the variable a or y is high or infinite. In 
order to understand what might be good strategies for constructing such approximations, 
we need first to understand the inherent properties of these maps that might allow us to 
circumvent this difficulty. 

We initiate such a program in ^ where we first analyze the smoothness of the solution 
map a u{a). In the case of the elliptic equation fll.Sp . it is easily seen that this map is not 
only infinitely differentiable, but also admits a holomorphic extension to certain subdomains 
of the complex valued X = L°°{D). We propose two general approaches which allow us to 
establish similar holomorphy properties for other relevant instances of linear and nonlinear 
parametric PDEs. One first approach is based on the Ladyzenskaia-Babushka-Brezzi the¬ 
ory. It applies to a range of linear PDEs where the operator and the right hand side have 
holomorphic dependence in a. These include parabolic and saddle-point problems, such as 
the heat equations or the Stokes problem, with parameter a in the diffusion term, similar 
to fll.Sp . One second approach is based on the implicit function theorem in complex valued 
Banach spaces. In contrast to the first approach, it can be applied to certain nonlinear 
PDEs. 

Using the affine representation (ll.lSp of a, we then study the solution map y i-4 u{y) 
under Assumption A, which means that it is defined on the whole of U. In addition to 
holomorphy, an important property of u can be extracted from the affine representation 
fll.lSp . The functions fij appearing in fll.lSp have norms ||'0j||v of varying size. Since the 
variable yj is scaled to be in [—1,1], when |l'0j||x: is small, this variable has a reduced effect 
on the variations of u{y). Thus the variables {yj)j>i are not democratic, but rather they 
have varying importance. In other words, the map y i—)■ u{y) is highly anisotropic. More 
specifically, we derive holomorphic extension results for this map on certain multivariate 
complex domains of tensor product type. In particular, we consider polydiscs of the general 
form 

Up := < Pj} = {z = e : \zj\ < pj) (1.31) 
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where p = {pj)j>i is a positive sequence which serves to describe the anisotropy of the 
solution map. We also consider polyellipses which deviate less far from the real axis. These 
holomorphy domains play a key role in the derivation of approximation results. 

Remark 1.1 While we are generally interested in real valued solutions u to the parametric 
PDE fll.ip . corresponding to real valued parameters a or y, our analysis of holomorphic 
smoothness leads us naturally to complex valued solutions, corresponding to complex valued 
parameters. For this reason, the spaces X, V, W are always assumed to be complex valued 
Banach spaces throughout this paper. 

1.4 Approximation of the solution map 

Reduced modeling methods seek to take advantage of the properties of the solution maps 
a 1 -^ u{a) oi y u{y) such as the holomorphy and anisotropy mentioned above. These 
properties suggest strategies for approximating these map u by simple functions Un in which 
the physical variables x and the parametric variable a or y are separated and hence take the 
form 

n 

(x, a) H- Un(x, a) := ^ Vi(x)(/)i(a), (1-32) 

i=l 

or 

n 

{x, y) Un{x, y) := ^ Vi{x)(pi{y), (1.33) 

i=l 

where {ui,... ,Vn} are functions of x living in the solution space V and {0i,..., are 
functions of a or y with values in M or C. 

We may view Un as a rank n approximation to u, in analogy with low rank approximation 
of matrices. We adopt the notations 

Un{a) =Un{-,a) and y H-«„(?/)= n„(-, y), (1.34) 

for the above approximations. 

Let us discuss the potential accuracy of separable approximations of the form fll.32p . If 
our objective is to capture u{a) for all a G A with a prescribed accuracy e{n), this means 
that we search for an error bound in the uniform sense, i.e., of the form 

\\u - Un\\L°-{A,v) ■= sup ||M(a) - Un{a)\\v < e{n). (1.35) 

For certain applications, in particular in the stochastic framework, we may instead decide 
to measure the error on average, for instance by searching for an error bound in the mean 
square sense, 

E{\\u - UnWv) ■= l|u - Unlli2(AFM) = J “ Mn(a)||yd/i(a) < e{nf, 

A 
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where is the probability measure for the distribution of a over A. Since p is a probability 
measure, one has for any n, 

< \\v\\l<^(_a,,V)- ( 1 - 37 ) 

Therefore the uniform bound is stronger than the average bound, in the sense that fll.35p 
implies fll.36p with the same value of e{n). 

Likewise, for the approximation of the map y u{y), we may search for a uniform bound 
\\u - Un\\L°-{UAy) ■= sup \\u{y) - Un{y)\\v < £{n) (1.38) 

V&Ua 


or a mean square bound 

E(||u - UnWl) ■= h - Un\\l2(UA,vA = j W'^^v) “ ||yd/i(|/) < e{nf , (1.39) 

Ua 

where y is the probability measure for the distribution of y over f/ 4 . 

Remark 1.2 We do not indicate the measure y in our notation L°°{A, V) or L^{U_a, V) , 
since in all relevant examples considered in this article we always consider the exact supre- 
mum over a in A or over y in t/ 4 , rather than the essential supremum. 

For any a E A, the approximation Un{a) belongs to 

14 := span{ni,..., n„}, (1.40) 

which is a hxed n-dimensional subspace of V. Ideal benchmarks for the performance of sep¬ 
arable expansions of the form fll.321) may thus be defined by selecting optimal n-dimensional 
spaces for the approximation of u{a) in either a uniform or an average sense. 

For uniform error bounds, this benchmark is given by the concept of Kolmogorov’s n- 
width, which is well known in approximation theory. If /C is a compact set in a Banach space 
V, we dehne its Kolmogorov n-width as 

dn{)C)v ■= inf sup min ||n — tc||y. (1-41) 

d\ra(Vn)<n "'GVn 

This quantity, hrst introduced in [57], describes the best achievable accuracy, in the norm of 
V, when approximating all possible elements of /C by elements from a linear n-dimensional 
space Vn- Obviously, the optimal choice of V* for approximation of n(a) in a uniform sense 
corresponds to the space, if it exists, that reaches the above infimum when K, is taken to be 
the solution manifold M.. The best achievable error in the uniform sense is thus given by 

e{n) := dn{M)v- (1.42) 

There exist many other notions of widths that are used to measure the size of compact sets. 
Here we only work with the above one and refer to m for a more general treatment. 
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For mean square bounds, in the case where id is a Hilbert space, the corresponding 
benchmark is related to the concept of principal component analysis, which is of common 
use in statistics. By choosing an arbitrary orthonormal basis (ej)j>o of V, we may expand 
u(a) according to 

u(a) = '^Ziei, Zi = Zi(a) := (u(a),ei}v (1-43) 

i>0 

Approximation of u(a) from an n-dimensional space of V is then equivalent to the approx¬ 
imation of z(a) := (zi(a))i>o from an n-dimensional space of £^(N). The optimal space is 
then obtained through the study of the correlation operator 

R = (Rij)ij>o, Rij :=E{zi{a)zj{a)) = J Zi{a)zj{a)dn{a). (1.44) 

This operator is symmetric, positive, compact and of trace class. It therefore admits 
an orthonormal basis of eigenvectors (gfc)fc>i associated to a positive, non-increasing and 
summable sequence {Xk)k>i of eigenvalues. The space 

:= span{gi,... ,g„}, (1.45) 

minimizes over all n-dimensional spaces G the mean square error between z and its orthogonal 
projection Pg^, with 

E(||z-Pg.z|Ip) = E^‘' 

k>n 

In turn, the optimal space V* is spanned by the functions 

Vk-=^9k,iei, k = l,...,n, (1.47) 

i>0 

where gk,i is the nth component of gk, that is, gk = (gk,i)i>o- The best achievable error in 
the mean square sense is thus given by 

£(ny:='^Xk. (1.48) 

k>n 

Note that, in view of fll.37p . one has the comparison 

^Xk<dn(Mfy. (1.49) 

k>n 

The above described optimal spaces are usually out of reach, both from an analytic 
and computational point of view, and it is therefore interesting to consider sub-optimal 
approximations. In addition, when considering the solution map y i—)■ u(y), the tensor 
product structure of U allows us to consider approximations based on further separation 
between the parametric variables, that is, where each function in fll.33p is itself a product 
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of univariate functions of the different Uj. The simplest example of such approximations are 
multivariate polynomials, which have the general form 

Un{x,y) = y" := 

1^6 An j>l 

where each index z/ = G is a finitely supported sequence of positive integers, or 

equivalently such that \u\ = X]j>i is a set of n such sequences. 

In §3, we obtain such polynomial approximations by taking finite portions of inhnite 
polynomial expansions of u. Here, we work under Assumption A, which means that u{y) 
is defined on the whole of U. We consider two types of expansions: 

• Power series the form 

'^(y) = 

where T is the set of all finitely supported sequence of positive integers. 

• Orthogonal series of the form 

“( 2 /) = P^{y) ■.= W_P^.{yj), (1.52) 

u&r j>i 

where Pk is the Legendre polynomial of degree k defined on [—1,1]. 

Using the holomorphy and anisotropy properties of the solution map y ^ u{y) established in 
§2 for specific classes of parametric PDEs, we derive a priori bounds on the U-norms ||ti/||v 
and ||tCy||y of the coefficients which appear in these expansions. In this way, we are able 
to establish algebraic convergence rates n~^ for certain truncations of the above expansions, 
where n is the cardinality of the truncation set A„. 

One critical aspect of the truncation strategy is that we retain the n largest coefficients, 
which is a form of nonlinear approximation, also known as sparse or best n-term approx¬ 
imation. With such a choice for A„, the exponent s in the convergence rate is related to 
the available summability for p < 1 of the U-norms of the coefficients in the considered 
inhnite expansion. In particular, for uniform approximation estimates, that is, in L°°{U, V), 
one has 

s=--l, (1.53) 

p 

once the ^^-summability of these U-norms has been proven. The main result from §3 shows 
that, under suitable assumptions, the (P summability of the sequence {\\'il^j\\x)j>o implies 
that the norms of the coefficients in the expansions fll.511) or fll.52p are also summable. 

The fact that we obtain the algebraic convergence rate despite the inhnite dimensional 
nature of the variable {yj)j>i reveals that the curse of dimensionality can be avoided in the 
approximation of relevant parametric PDEs. 
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1.5 The n-widths of the solution manifold A4 

In both the uniform or mean square ways of measuring error, as described above, the success 
of reduced modeling can be proven if dn{M.)v converges to zero sufficiently fast as n —)■ +cxo. 
Thus, the study of these widths constitutes a major subject in the theoretical justification 
of reduced modeling. 

A common way to measure the widths of compact sets K, in classical spaces is to embed 
/C into an appropriate smoothness space such as a Sobolev or Besov space. For example, 
in our model parametric elliptic equation fll.Sp . the space V is Hq{D), and this approach 
would lead us to examine the Sobolev smoothness of the individual functions u{a) for 
m > 1. Classical elliptic regularity theory says that for smooth domains the smoothness of 
u{a) can be inferred from the smoothness of the right side /. However, for general domains, 
there are severe limits on this regularity due to the irregularity of the boundary. Therefore, 
bounding the decay of widths of Ad through such regularity results will generally prove only 
slow decay for dn{M.)v and therefore is not useful for obtaining the fast decay rates we seek. 
Indeed, recall, that classical smoothness spaces, such as Sobolev or Besov spaces of order m 
in d variables, have widths that decay, at best, like as n —)■ cxo when these widths 

are measured in an W''’^ norm for some r < m. For example, it is known that if /C is the 
unit ball of C™'([—1, then its n-width in satisfies 

< d„(/C)Loc < n > 1. (1.54) 

Likewise, if /C is the unit ball of 1,1]“^), then its n-width in satishes 

c^n-(rn-i)/d < n>l. (1.55) 

Also note that the regularity of u{a), as measured by membership in Sobolev and Besov 
spaces, is closely related to the performance of piecewise polynomial approximation such as 
that used in finite element methods and is the reason why these algorithms have rather slow 
convergence. So the fast decay of dn{M.)v to zero cannot be obtained by such an approach. 

The widths dn{M.)v go to zero fast not because the individual elements in M. are smooth 
in the physical variable, but rather because M. is the image of the solution map u which is, 
as previously discussed, smooth and anisotropic in the parametric variable. However, let us 
remark that it is by no means trivial to deduce fast decay for dn{Ai)v from this fact alone, 
because of what is called the curse of dimensionality. Namely, approximation rates for a 
given target function are generally proved by showing that the target function, in our case 
the function u, has sufficiently high regularity. But, in high dimensions, regularity by itself is 
usually not enough. Indeed, returning to the bounds fll.55p . we see that the large dimension 
d affects the approximation rate in two detrimental ways. The exponent in the smoothness 
rate is divided by d and the constants Cd are known to grow exponentially with increasing 
d. In our case d can even be inhnite and this makes the derivation of approximation rates a 
subtle problem. 

In §4, we discuss general principles for estimating by above the n-widths dn{M.)v of the 
solution manifold A4. One immediate consequence of the approximation results in §3 is that. 


12 






for specific classes of parametric PDEs, when Assumption A holds and (||V^j||x)j>o ^ 
then these widths decay at least like n~^ with s given by 01.531) . We extend this analysis to 
the general case when Assumption A does not necessarily hold. Since the manifold Ai is 
not directly accessible, one would like to understand what properties of A, which is assumed 
to be completely known to us, imply decay rates for dn(M). We show that the asymptotic 
decay of the n-width oiAiinV is related to that of the n-width of A in X. This follows 
from general results on widths of images of compact sets under holomorphic maps. Namely, 
if M : A —)■ 1/ is holomorphic in a neighborhood of a general compact set A C A, then we 
prove the following result on the n-width of the image A4. = u{A) in V: 

sup rf dn{A)x < oo ^ sup < oo, s < r — 1. (1.56) 

n>l n.>l 

This result shows that from the view point of preserving n-widths, holomorphic maps behave 
almost as good as linear maps, up a loss of 1 in the convergence rate. One open problem is 
to understand if this loss is sharp or could be improved. 

1.6 Numerical methods for reduced modeling 

The above mentioned results on holomorphic extensions, sparse expansions for m, and n- 
widths of the manifold At, can be thought of as theoretical justihcations for the role of 
reduced modeling in solving parametric and stochastic equations. They provide evidence 
that reduced modeling numerical methods should yield signihcant computational savings 
over traditional methods such as hnite element solvers for parametric problems or Monte 
Carlo methods for stochastic problems. However, they do not constitute actual numerical 
methods. 

The second part of our article turns to the construction of numerical algorithms motivated 
by these theoretical results. Such algorithms compute specihc separable approximations 
of the form fll.32p or fll.33p for a given value of n at an affordable computational cost. 
Our objective, in this regard, is not to give an exhaustive description of numerical reduced 
modeling methods and their numerous variants. Rather, our main focus is to introduce some 
important representative examples of these methods for which an a priori analysis quantihes 
the gains in numerical performance of these methods. 

One important distinction is between non-adaptive and adaptive methods. In the hrst 
ones, the choice of the functions {0i,..., 0„} or of {ui,..., Vn} used in fll.32p or fll.33p . for 
a given value of n is made in an a priori manner, if possible using the available information 
on the problem. In the second ones, the computations executed for lower values of n are 
exploited in order to monitor the choice at stage n. One desirable feature of an adaptive 
algorithm is that it should be incremental or greedy: only one new function (pn or Vn is added 
at each stage to the n — 1 previously selected functions which are left unchanged. Adaptive 
algorithms are known to often perform better than their non-adaptive counterpart, but their 
convergence analysis is usually more delicate. 

A second important distinction between the various numerical methods is whether they 
are non-intrusive or intrusive. A non-intrusive algorithm builds on an existing exact or ap- 
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proximate solver for the PDE which may be computationally expensive. It derives approxi¬ 
mations of the form fll.32p or fll.33p by choosing instances a^,a"' G ^ or G f /4 

and using the values u{a^) or u{y^) computed by the solver. Non-intrusive algorithms may 
be implemented even when this solver is a black box, and therefore with a possibly limited 
knowledge on the exact PDE model. An intrusive algorithm, on the other hand, directly 
exploits the precise form of the PDE for computing the approximation fll.32p or fll.33p . and 
therefore requires the full knowledge of the PDE model for its implementation. 

It should be noted that instances n(a*) as well as the functions {ui,..., used in fll.32p 
or fll.33p can only be computed with a certain level of spatial discretization, for example 
resulting from a finite element solver. In such a case they belong to a finite dimensional 
space 14 C V. If the same hnite element space I 4 is used to discretize all instances u{a) 
up to a precision that is satisfactory for the user, this means that we are actually trying to 
capture the approximate solution maps 

a Uh{a) e Vh, (1-57) 

or 

y H- Uh{y) G 14 . (1.58) 

The analysis of the performance of numerical reduced modeling methods needs to incorporate 
the additional error produced by this discretization. 

1.7 Sparse polynomial approximation algorithms 

One first class of methods that we analyze consists in finding a numerically computable 
polynomial approximation of the form fll.50p . There are two major tasks in constructing 
such a numerical approximation: (i) End good truncation sets (A„)„>i and (ii) numerically 
compute an approximation to the coefficients Vy for each v G A„. By far, the most significant 
issue in numerical methods based on polynomial expansions is to hnd a good choice for the 
sets (A„)„>i. If everything was known to us, we would simply take for A„ the set of indices 
V corresponding to the n largest I/-norms of the coefficients in fll.511) or fll.52p . However, 
Ending such an optimal A„ would require in principle that we compute the coefEcients for all 
values V ^ T which is obviously out of reach. In addition, the structure of the optimal set A„ 
can be quite complicated. One saving factor is that our analysis in §3 gives a priori bounds 
on the size of these coefEcients. These bounds can be used in order to make an a priori 
selection of the sets A„. This is a non-adaptive approach, which generally gives suboptimal 
performance due to the possible lack of sharpness in the a priori bounds. This leads one 
to try to enhance performance by combining the a priori bounds together with an adaptive 
selection of the sets (A„)„>i. 

The numerical methods are facilitated by imposing that the selected index sets A„ are 
downward closed (or lower sets), i.e. satisfy the following property: 

p E Kn and z> < z/ ^ E G A„, (1.59) 

where u < u means that uj < vj for all j. 
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In §6, we discuss algorithms which compute the polynomial approximation by an in¬ 
terpolation process. These algorithms are non-intrusive and apply to a broad scope of 
problems. One key issue is the choice of the interpolation points, which is facilitated by 
the following result: if {zk}k>o is a sequence of pairwise distinct points in [—1,1] and if 
Zu := {zuj)j>i G U, then for any downward closed set A„, any polynomial of the form fll.SOp 
is uniquely characterized by its value on the grid {zi, : v G A„}. This allows us to construct 
the interpolation in a hierarchical manner, by simultaneously incrementing the polynomial 
space and the interpolation grid. The sets A^ can either be a priori chosen based on the 
bounds on the coefficients established in §3 or adaptively generated. These sets generally 
differ from the ideal sets corresponding to the n largest coefficients (which may not fullhll 
the downward closedness property). We show that certain choices of the univariate sequence 
{zk}k>o-i known as Leja or i?-Leja points lead to stable interpolation processes (in the sense 
that the Lebesgue constant have moderate growth with the number of points) allowing us to 
retrieve by interpolation the same algebraic convergence rate n~^ which are proved for the 
best polynomial approximations. 

In §7, we discuss another class of algorithms which recursively compute the exact coef- 
hcients in the Taylor series of the approximate solution map fll.581) . These algorithms are 
intrusive and they only apply to problems where V is linear both in u and a up to a constant 
term, such as the elliptic equation fll.51) which serves a guiding example. The recursive com¬ 
putation is facilitated by imposing that the index sets A„ in the truncated Taylor expansion 
are downward closed. Similar to the interpolation algorithm from §6, the sets A„ can either 
be a priori chosen based on the available bounds on the coefficients established in §3, or 
adaptively generated. One main result shows that adaptive algorithms based on a so-called 
bulk chasing procedure have the same convergence rate n~^ as the one which is established 
when using the index sets corresponding to the n largest Taylor coefficients. 

1.8 Reduced basis algorithms 

The second class of methods that we analyze seeks to hnd, in an offline stage, a set of func¬ 
tions {ui,..., Vn} for which the resulting n-dimensional space 14 := span{ui,..., Vn} is close 
to an optimal linear n-dimensional approximation space. For mean square estimates, one 
such approach, known as proper orthogonal decomposition^ consists in building the functions 
{ui,... ,Vn} based on an approximation of the exact covariance operator fll.44p computed 
from a sufficiently dense sampling of the random solution u{a). Another approach, which 
targets uniform estimates, is the reduced basis method, which consists in generating 14 by a 
selection of n particular solution instances {^(a^),..., u{a‘^)} chosen from a very large set of 
potential candidates. In both cases, the offline stage is potentially very costly. 

Once such a space 14 is chosen, one builds an online solver, such that for any given a E A, 
the approximate solution Un{a) is an element from 14- There are several possibilities on how 
to build this online solver. The most prominent of these is to take the Galerkin projection 
of u{a) onto 14 which consist of hnding Un{a) by solving the system of equations 

{V{un{a),a),w) = 0, w E Vn (1.60) 
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for a suitable duality product (•,•). This online computation determines for each a the 
values (pi{a) for i = 1,... ,n which appears in fll.32p . The advantage of using the Galerkin 
solver, is that, for certain problems such as elliptic ones, it is known to give the best error 
in approximating u{a) by elements of Vn when error is measured in the norm of V. Its 
disadvantage its computational cost in hnding Un{(i) given the query a. For this reason other 
projections onto Vn are also studied, some of these based on interpolation. 

The key issue when using reduced basis methods is how to find a good space 14, i.e. 
how to find good basis functions. In §8, we discuss an elementary greedy strategy for the 
offline selection of the instances Vi = u{a''), that consists in picking the n-th instance which 
deviates the most from the space I4-i generated from the n — 1 previously selected ones. 
The approximation error 

o'n(-Ad) := sup inf ||u —tc||, (1-61) 

vGM 

produced by such spaces may be signihcantly larger than the ideal benchmark of the n-width 
of the solution manifold for a given value of n. However, a striking result is that both are 
comparable in terms of rate of decay: for any s > 0, there is a constant Cg such that 

sup < 4s sup (1.62) 

n>l n>l 

Similar results are established for exponential convergence rates. 

While both classes of numerical methods aim to construct separable approximations of 
the form fll.32p or fll.33p . there is a signihcant distinction between them in the way they 
organize computation. For the hrst class of polynomial approximation methods, the offline 
stage Axes the polynomial functions through the selection of the set A„ and computes the 
coefficients Uj. Then the online stage is in some sense trivial since it simply computes Un{y) 
through the linear combination fll.5Up . For the second class, the online stage still requires 
solving PDF approximately in the chosen reduced space 14- This offline/online splitting 
makes it difficult to draw a fair comparison between the different methods from the point of 
view of computational time vs accuracy. 

Notation for constants: Numerous multiplicative constants appear throughout this paper, 
for example in convergence estimates. We use the generic notation C, which may there¬ 
fore change value between different formulas, and if necessary we indicate the parameters on 
which C depends. We use a more specific notation if we want to express the dependence of 
the constant with respect to a cetain parameter (for example the dimension d in fll.55p ) or 
if we want to refer to this specific constant later in the paper. 

1.9 Historical orientation 

Numerical methods for parametric and stochastic PDFs using polynomials (or other approx¬ 
imation tools) in the parametric variable have been widely studied since the 1990’s. We 
refer in particular to |10l 061 |55l |56l |90] for general introductions to these approaches, and 
to [H 0111371 [53l EH [83l [H] for related work prior to the results exposed in our paper. 
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The approximation results presented in §3 have been obtained by the authors and their 
co-authors in a series of paper [221 EH EH], and the results in §4 on the evaluation of n- 
widths are from [22]. These results establish for the hrst time convergence rates immune to 
the curse of dimensionality, in the sense that they hold with infinitely many variables, see 
also [H] for a survey dealing in particular with these issues. In a similar infinite dimensional 
framework, and not covered in our paper, let us mention the following related works: (i) 
similar holomorphy and approximation results are established in [471145] 125] for specific type 
of PDEs and control problems, (ii) approximation of integrals by quadratures is discussed in 
[SQlini], (iii) inverse problems are discussed in [78l[79llH2], following the Bayesian perspective 
from [87], and (iv) diffusion problems with lognormal coefficients are treated in [50] ES] 145] . 

The sparse interpolation method presented in §6 is introduced and studied in [18]. See 
also [2] I4117I1172] for related work on collocation methods. Other non-intrusive methods 
are based on least-square regression as discussed in [TBl 14411441 EH], or on pseudo-spectral 
approximation as discussed in [911126]. The Taylor approximation algorithm presented in §7 
is introduced and studied in ini. Other intrusive methods based on Galerkin projection are 
discussed in [H El |241142] . 

Reduced basis methods have been studied since the 1980’s [74]. The greedy algorithms 
presented in §8 have been introduced and discussed in [SHI EHl EZl EH ES], and their conver¬ 
gence analysis was given in [6] and [42]. We refer to [52] for a general introduction on the 
related POD method, which is not discussed in our paper. 

Part I. Smoothness and approximation results 
2 Holomorphic extensions 

In this section, we discuss smoothness properties of the solution maps a u{a) and y u{y) 
which are central to the development of efficient numerical methods that are immune to the 
curse of dimensionality. We show that, under suitable assumptions on the parametric PDE, 
these maps admit holomorphic extensions to certain complex domains. Recall that a map 
F from a complex Banach space X to a second complex Banach space Y is said to be 
holomorphic on an open set D C X if for each x & V, F has a Frechet derivative dF{x) at 
X. Here dF{x) is a linear operator mapping X to X such that 


||F(a; + h)- F{x) - dF{x)h\\Y = o(||h|U), h e X. 


( 2 . 1 ) 


2.1 Extension of a u(a) for the model elliptic equation 

In order to formulate results on holomorphy, we need to introduce existence-uniqueness 
theory for solutions to fll.ip in the case that a is complex valued. 

We begin with our guiding example of the elliptic equation fll.5p . We now consider 


(X,R,W) = 


( 2 . 2 ) 
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as spaces of complex valued functions, and extend the standard variational formulation 
to such spaces in a straightforward manner: for a given a & X, and with / G W, hnd 
u = u{a) E V such that 

J aXu-Xv = {f,v)w,v, veV, (2.3) 

D 

where in the left integrand, 

m 

Xu • Vn := ^ (2.4) 

i=l 

is the standard hilbertian inner product, and (/, n) is the anti-duality pairing between W 
and V, which when / G L^{D) is given by the hilbertian inner product 


(/.“) 



(2.5) 


We recall that 

and 


v\\v := ||Vn||L 2 (^), 


|n||w := sup{(n,tc) : ||tc|lv < 1}. 


( 2 . 6 ) 

(2.7) 


This is therefore a particular case of a linear problem with the following general variational 
formulation. Let 03 denote the set of all sesquilinear forms dehned onVxV and let W = V* 
be the set of all antilinear functionals dehned on V, i.e., W is the antidual of V. We dehne 
the following norm on IB: 

||i?|| := sup \B{v,w)\. (2.8) 


Problem: Given B E ^ and L E W, find u E V such that 

B{u,v) = L{v), Vn G V. 


(2.9) 


The existence-uniqueness theory for such problems can be proven from the complex 
version of Lax-Milgram theorem given in Theorem 12.11 This theorem is a particular case of 
Theorem 12.21 proved below. To formulate this theorem and for later use, we introduce the 
notation C{X, Y) for the space of all linear operators T mapping the Banach space X into 
the Banach space Y with its usual norm 

||7'|l£(x,y) := sup -qi—(2 -10) 
xGX II 2^11 JV 

Given B E one has that B{u, ■) is an anti-linear functional and hence for any u E V, 
there is a Bu E W such that 


B{u,v) = {Bu,v)w,v, veV, 


( 2 . 11 ) 
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where (•, ■)w,v is the anti-duality pairing between W and V. Therefore, S is a linear operator 
from V into W and its norm is the same as that of B: 


||^||£(y,w) - ||-B||. (2.12) 

So the operator B is bounded and hence continuous. The problem fl2.9p is equivalent to the 
equation 

Bu = L, (2.13) 

set in W. With this notation and remarks in hand, we can now state the complex version 
of the Lax-Milgram theorem. 

Theorem 2.1 Assume that, B &is a sesquilinear form on V x V such that 

\B{u,u)\> a\\u\\\^, u E V, (2-14) 

for some a > 0. Then, B is is invertible and its inverse satisfies 

\\^~^\\c{w,v) < —• (2-15) 

a 

Thus, for each L E W, the problem fl2.9p has a unique solution ul = B~^{L) which satisfies 
the a priori estimate 

ll«ilk < hhn, ( 2 . 16 ) 

a 

For the particular B and L given by the left and right side of (12.3p . the ellipticity condition 
fl2.14p holds with a = r under the assumption 

3?(a(x)) > r, X E D, (2.17) 

since the latter implies, for all v E V, 

\B{v,v)\>ifi{B{v,v)) = y5R(a)|Vnp > r||Vn||i 2 = r||n|l^. (2.18) 

D 

Therefore, for any r > 0 we may extend the solution map a ^ u{a) of the elliptic problem 
(II.5p to the complex domain 

Vr-.= {aEX : 3fJ(a) > r}, (2.19) 

with the uniform bound 

\\u\\L°°{Vry) = sup{||M(a)||y : a E Vr} < ■ (2.20) 

This extension is therefore defined on the open set T) ;= 
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The fact that this map is holomorphic immediately follows by viewing it as a chain of 
holomorphic maps: introducing for any a the operator B{a) : v ^ —div(aVn) acting from 
V into W, we can decompose a u{a) into the chain of maps 

a 1 -^ B{a) I—)■ B{a)~^ B{a)~^f = u{a). (2-21) 

The first and third maps are continuous linear and therefore holomorphic, from X into 
C{y, W) and from £(1T, V) onto V respectively. The second map is the operator inversion 
which is holomorphic at any invertible B G W). 

For further purposes, it is interesting to compute the Frechet complex derivative du{a) G 
C{X,V) for the elliptic problem fll.Sp . Fix an a G and let h G X be such that Hh-Hx < 
Then, the solution map is also dehned at a + h G Pr. Substracting the variational 
formulations fl2.3p for u{a + h) and u{a), we hnd that 


aX{u{a + h) — u{a)) ■ Xv = — hXu{a + h) ■ Xv. 


( 2 . 22 ) 


D 


D 


We hrst use this identity to obtain a Lipschitz continuity bound: by taking v = u{a+h)—u{a) 
and taking the real part of both sides, we find that 

r\\u{a + h) — u{a)\\y < ||h||x||M(a + h)\\v\\u{a + h) — u{a)\\v 


and therefore 


< + h) - u{a)\\v, 


\u{a + h)-u{a)\\v<C\\h\\x, 


(2.23) 


We next show that du{a)h can be defined as the solution w = w{h) G Id to the problem 


aXw■Xv= 


hXu{a) ■ Xv, V eV, 


(2.24) 


D 


D 


which is well-posed in the sense of the above Lax-Milgram theorem. Indeed, on the one 
hand, the dependence of tc on h is linear and it is continuous because taking v = w we find 
that 

\\w\\v<C\\h\\x, 6 *:=^^. 


On the other hand, the remainder g = u{a + h) — u{a) — w is the solution to 
J aXg ■ Xv = j h{Xu{a) — Xu{a + h)) ■ Xv, v E V, 

D D 

which by taking v = g and using fl2.23p gives the quadratic bound 
< hlftlUIWo) - U(a + ft)lk < c\\h\\\, C := 


(2.25) 


(2.26) 


(2.27) 


This conhrms that du{a)h = w{h). 
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2.2 Extensions by the Ladyzhenskaya-Babushka-Brezzi theory 

Our next goal is to treat more general linear parametric problems that are not necessarily 
elliptic. In particular, we have in mind parabolic problems such as the heat equation, or 
saddle points problems such as the Stokes equations. In order to formulate this general class 
of problems, we suppose that V and V are two complex Hilbert spaces with inner products 
(•, ■)v and (•, •)y, respectively. So, for example, for every n G H, {v, ■)v is an anti-linear 
functional on V and {■,v)v is a linear functional on V, and the same holds for V. We let 
W := V* denote the space of all anti-linear functionals on V, i.e. W is the anti-dual space 
of H. 

We now denote by 03 = 03(1/, H) the set of all such sesquilinear forms on V x V and 
introduce the following topology on 03, 

||i?|| = max \B{v,w)\ (2.28) 

||j;||v = l,||u^||^=l 

As in the previous section, given i? G 18, one has that B{u, •) is an anti-linear functional on 
V. Therefore, as in the previous section, we can dehne the linear operator B that maps V 
into W by fl2.1ip and we again have ||H||/;(y,vv) = ||-B||. 

We consider the following general problem. 

Problem: Given H G 03(1/, V) and L G W, find u E V such that 

B{u,v) = L{v), \/veV, (2.29) 


This problem is again equivalent to the equation Bu = L. In order to establish the existence- 
uniqueness for complex formulations of such problems, we use the following complex valued 
version of Ladyzenskaya-Babushka-Brezzi theorem. 

Theorem 2.2 Assume that B G 03(1/,!/) satisfies 


, \B{u,v)\ \B{u,v)\ 

mt sup r® ^ n—n—ini“ — 


uev 


Imllvll'^lly v€V ueV 


(2.30) 


for some a > 0 Then, the operator B defined via (12.111) is invertible and its inverse satisfies 


\\B '^Wawy) < —• (2-31) 

a 

Hence, for each L G W, the problem fl2.29p has a unique solution ul = B~^{L) which satisfies 
the a priori estimate 

IIklIIv- < —. (2.32) 

a 

Proof: From the hrst inf-sup condition in (12.301) we obtain that 

ttll^lly < ||Bn||w, u eV. (2.33) 
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This shows that B is injective and that its range B(y) is closed in W. In order to prove 
that B is invertible we need to show that B(y) is all of W. We prove it by contradiction; if 
B{V) was strictly contained in W, we can pick a non-trivial w & W which is orthogonal in 
W to all elements of B{V). Then, we define v = v{w) in the antidual of W, that is, n G Id 
by setting 

v:e^{e, w)^ = (e, v)^y. (2.34) 

It follows that 

B{u, v) = {Bu, v)y,y = v{Bu) = {Bu, = 0, (2.35) 

for all u E V. This contradicts the second inf-sup condition. Hence B is invertible and the 
bound fl2.3ip on its inverse follows from (I2.33p . □ 

Remark 2.3 One particular case of Thcorem A2.2\ is Theorem \2.1\ since (I2.14p implies (I2.30p 
in the case V = V. 

The argument, centering on fl2.2ip . which justified the holomorphy of the solution map 
for our canonical elliptic setting, may be generalized to any linear problem of the form 

B{a)u = /(a), (2.36) 

where a i—)■ B{a) and a i—)■ /(a) are holomorphic maps from an open set P C X into 
C{y,W) and W respectively, where (X, V, IT) are complex Banach spaces. Namely, if B{a) 
is invertible for all a G P, we hnd by 

a B{a) i-G- B{a)~^ ^ B{a)~^f{a) = u{a), (2.37) 

that the solution map is holomorphic over V. 

In particular, we may consider the following parametric linear problems of the form fl2.29p 
for a pair of Hilbert spaces (T, V)\ for a given a G X, find u{a) G V such that 

B{u{a),v] a) = L{v] a), veV, (2.38) 

where B{-, ya) and L{-]a) are continuous sesquilinear and antilinear forms over V xV and 
V respectively, which depend on a G X. 

Corollary 2.4 If the assumptions of Theorem 12.21 are satisfied for the problem (I2.38p for 
each a in a set V G X, then the operator B{a) defined by B{u,v;a) = {B{a)u,v)y,y is well 
defined and invertible from V to W = V* for all a E V and the solution map a u{a) is 
well defined from V into V. If the constant a > 0 can be chosen independent of a E V and 
if sup ||L(-, a)||vi/ =: M < oo, then the solution map is uniformly bounded on V with 

M 

\\u\\l^{v,v) = sup \\u{a)\\v < —. (2.39) 

aev tt 

In addition, ifV is an open set and if the maps a L(-; a) and a ^ B{-, •; a) are holomorphic 
from V into W and V into IB = ^(y,V), respectively, then the solution map a u{a) is 
holomorphic overT>. 
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We next give two examples that fall in this general framework. The hrst one is a linear 
parabolic equation with parametrized evolution operator. As a simple model, we consider 
the heat equation parametrized by its diffusion coefficient: 


dtu = div(aVM) + /, in ]0, T[xD, (2.40) 

where D C is a Lipschitz domain, / G L‘^{]0,T[] H~^{D)), and the initial and boundary 
value conditions are 

u\t=o = uoe L^{D) and u\qd = 0. (2.41) 

It is well known that a solution space for this PDE is 

E := L^Qo, T[; Hl{D)) n T[; H-\D)). (2.42) 

We obtain a space-time variational formulation of the type fl2.38p by introducing the auxiliary 
space 

E ;= L\%T[,Hl{D)) X L\D)), (2.43) 

and dehning for a G X := u E V and v = (ui, ^ 2 ) G V, 


B{u,v;a) := / / (^dtu{x, t)vi{x, t) + aVu{x, t) 


Vw] 


(x, t)jdxdt 


u{-,0)v2{x)dx, (2.44) 


0 D 


D 


and 

T 

L{v;a):= j {f{-,t),vi{-,t))dt + j uo{x)v2{x)dx, (2.45) 

0 D 

where {•, •) is the anti-duality pairing between H~^{D) and Hq{D). 

The fact that these are bounded sesquilinear and antilinear forms follows readily from 
the choice of spaces X, V and V. By using the general arguments from [ST], one can show 
that whenever the diffusion coefficient comes from the uniform ellipticity class of fl2.17p . 
then the inf-sup condition fl2.30p holds, with the values of a in fl2.30p depending on that of 
r. Therefore, from Theorem 12.21 the solution map a 1 —)■ u{a) is dehned on with a uniform 
bound 

\\u\\L°°{Vry) = sup{||M(a)|lv : a G Vr} < Cr- (2.46) 

Since r > 0 is arbitrary the solution map is therefore dehned on the open set "D := Ur>oT^r, 
and its holomorphy follows from Corollary 12.41 since the sesquilinear form i?(-, qa) depends 
on a in an affine manner. 

The second example is a linear elliptic PDE parametrized by the shape of the physical 
domain. As a simple model we consider the Laplace equation 

-Aw = l, (2.47) 
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set on a domain Da C with homogeneous Dirichlet boundary conditions w\g£,^ = 0. Here 
a describes the shape of the domain Da in polar coordinates, according to 

Da'.= {x = {pcosO, psm6) : 0 < p < a{6)}. (2.48) 

In order to obtain a Lipschitz domain, we take a G X with 

X-H^y(L“(10,2ir[), (2.49) 

the space of 27r periodic Lipschitz continuous functions, which is equiped with the norm 

||a||x := ||a||L°°([o,27r[) + ||a^||L°°([o,27r[)- (2.50) 

If in addition, for some r > 0, we have 

a{9)>r, 0G[O,27r[. (2.51) 

then Da is a Lipschitz domain and thus there exists a unique solution w = w {a) G HQ{Da) 
to fl2.47p in the sense of the variational formulation 


J'Vw-'Vv = J V, n G iLo(/la)- (2.52) 

Da Da 

Note that fl2.5ip implies that Da is star-shaped with respect to a ball of sufficiently small 
radius centered at the origin. In order to study the parametric smoothness of a e-)> w{a), we 
need to represent the solution in a function space V which does not change with a. One way 
to do this is to utilize the pullback of the solution to a reference domain D under a suitable 
transformation Fa which maps D into Da- A natural choice is to take for D the unit disc of 
centered at the origin, and use the transformation 


Fa{p cos 9, psin 9) := {a{9)p cos 9, a{9)psin 9). 


For X = {pcos9, psin9), the Jacobian matrix of Fa{x) is given by 


/ a{9) cos 9 a'{9) cos 9 — a{9) sin 9 \ 
y a(6*) sin 6* a'(0) sin6 '-|-a(0) cos6' y’ 


(2.53) 


(2.54) 


and its determinant by 
We denote by 


Ja{x) := a{9Y >r^>0. 


(2.55) 


u = w o Fa eV := Hq{D), (2.56) 

the pullback solution and study the solution map a e- )■ u{a). Using Fa as a change of variable 
in (I2.52p . we find that u satishes 


J MaVu ■ Vn 

D 


j JaV, n G U, 
D 


(2.57) 
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that is, the variational formulation of the equation 


- div(M„VM) = Ja, (2.58) 

set over the domain D with homogeneous Dirichlet boundary conditions, where 

hUx) - Mx)dF-'(x)dF-‘(x) = ^ j , b{ff) := (2.59) 

We define the complex extension as a solution of (12.381) . with V = V = Hq{D) and the forms 
B and L given by and 


B{u,v]a)\= / Ma'Vu-'Vv and L{v,a):= / JaV. 


D 


D 


(2.60) 


In view of the expressions of and J^, it is readily seen that a L{-]a) is holomorphic 
from X onto W = V* and that a i—)■ B{-,-]a) is holomorphic from the open set of X of 
nowhere vanishing functions into 53 = 53(1/, 1/). It remains to understand for which a G X 
the problem has a solution. Introducing the real symmetric matrix Ra{x) = 3?(Ma(x)) and 
denoting by Amin(a,x) its smallest eigenvalue, we have for any u eV, 


^{B{u, u; a)) 


D 


min \rain{a,x). 

xGD 


(2,61) 


Therefore the coercivity condition (12.141) holds if Amin (a) > 0. A straightforward computation 
shows that 


det(i?a(a:)) = 1 - and tr{Ra{x)) = 1 + 3fJ(6(0))2 - (2.62) 

We are thus are ensured of the existence of the solution u{a) E X for those a E X which are 
nowhere vanishing and such that 

|3(K«))I<1. [0,2x1, (2.63) 

a[U) 

By application of Corollary 12.41 the solution map has a holomorphic extension onto the open 
domain V E X consisting of those a E X which are nowhere vanishing and such that fl2.63p 
holds. 

One important observation for this last example is the following: if the parameter domain 
is a compact set of real valued functions in X such that fl2.5ip holds for all a E A, then there 
exists an open neighbourhood O oi A in the complex valued X such that the holomorphic 
extension of the solution map is uniformly bounded over O. Indeed, in view of the above 
remarks, for every a E A, there exists e = e{a) > 0 such that 

B{a,e):={d : ||a — a||x < C "D, (2.64) 
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and such that the assumptions of Theorem 12.21 are satished for the problem fl2.38l) with 
constants a and Cl that are uniform over B{a,e). By compactness of A, we may dehne O 
as a hnite cover of A of the form 

M 

O = \^B{ai,e{ai)), (2.65) 

i=l 

for {oi,..., om} £ A, so that a u{a) is holomorphic and uniformly bounded over O. 


2.3 Extensions by the implicit function theorem 


In this section, we consider a further generalization of problems of the form fll.ip for which 
we can prove holomorphy of the solution map on certain subsets of the complex Banach 
space X. In particular, this generalization can be applied to certain nonlinear PDEs. As a 
simple example, to motivate what follows, we consider the nonlinear elliptic equation 

— div(aV'u) = /, (2.66) 


set on a bounded Lipschitz domain D C 'MX where m = 2 or 3, with homogeneous Dirichlet 
boundary conditions uiqd = 0, parametrized by the diffusion coefficient a. Similar to the 
linear equation fll.51) . we set 

(X, y, IT) = (L-(D), H-\D)), (2.67) 


and consider for f E W and a E X the variational formulation 

/ u^v + / aVuVv = {f,v)w,v, v eV. 


D 


D 


( 2 . 68 ) 


By the theory of monotone operators, see for example Theorem 1 in Chapter 6 of ua, and 
using the Sobolev embedding Hq{D) C L‘^{D), one can easily check that for any real valued 
a E X such that a > r for some r > 0, there exists a unique solution u{a) to fl2.68l) which 
satishes the a priori estimate 

||M(a)||y < Mk. (2.69) 

r 

However, we cannot use monotone operator theory to derive a solution u{a) to fl2.68l) 
for complex valued a since in this form the monotonicity is lost. One could consider an 
alternative extension of (12.681) to complex valued functions given by 

\u\‘^u — div(aVM) = /. (2.70) 


For this equation, one can now apply monotone operator theory to the real and imaginary 
part of the equation in order to show that the problem is well posed when 3fJ(a) > r for some 
r > 0. However, this extension is not holomorphic in the variable a due to the presence of 
the modulus in (12.701) . We thus want to adhere to the original problem (I2.66p for complex 
valued a, but hnd an alternative to using monotone operator theory. This alternative is 
provided by the following general theorem, which is based on the holomorphic version of the 
implicit function theorem in Banach spaces. 
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Theorem 2.5 Let V : V x X ^ W where X, V and W are complex Banach spaces and let 
A C X be a compact set such that for each a E A, there exists a unique solution u{a) G V 
to fll.ip . Assume, in addition, that there exists an open set V of X containing A for which 

(i) V is a holomorphic map from V x V to W, 

(ii) for each a E A, the partial differential duV{u{a),a) is an isomorphism from V to W. 

Then, there exists an open setO <Z X containing A, such that u has a holomorphic extension 
to O which takes values in V and which is uniformly bounded: 

\\u\\L^{oy) = sup ||M(a)||y < oo. (2.71) 

aGO 

Proof: Let a E A. The assumptions (i) and (ii) allow us to apply the holomorphic version 
of the implicit function theorem on complex Banach spaces, see [33l Theorem 10.2.1], and 
conclude that there exists an e = e{a) > 0, and a unique holomorphic extension of u from 
the open ball B{a,e) of X, with center a and radius e into V, such that V{u{h),b) = 0 for 
any b E B{a,e). In addition, the map u is uniformly bounded and holomorphic on B{a,e) 
with ^ 

dub =—(^duV{u{b),b)^ o dVb{u{b),b), b E B{a,e) . (2.72) 

From the compactness of A, we may dehne 0 as a hnite cover of A of the form 

M 

0 = [_}B{ai,e{af)), (2.73) 

i=l 

for {oi,..., om} G a. Therefore u has a uniformly bounded holomorphic extension over O. 
□ 


There are many settings where Theorem 12.51 can be applied, including nonlinear equa¬ 
tions. As an example, we return to fl2.66p where the operator V is given by 

V{u, a) = — div(aVu) — /, (2.74) 


or in variational form by 

{V{u,a),v)wy = J u^v + J aXuXv — {f,v)wy- (2.75) 

D D 

Using the fact that Hq{D) is continuously embedded into L^{D), it is easily seen that V acts 
as a holomorphic map from V x X to W, and therefore assumption (i) holds. 

We now take for A any compact set of X contained in the set of real valued functions 
a E X such that a > r where r > 0 is hxed, so that there exists a unique solution u{a) E V 
for each a E A. We observe that, for a E A, 

du'P{u{a), a){w) = 3u{a)‘^w — div(aVtc). (2.76) 
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The operator duV{u{a),a) is associated to the sesquilinear form 


A{v,w; a) = {duV{u{a), a){v),w)v',v = J 3u{aYvw + J aVv-Vw. (2.77) 

D D 

which is continuous over V x V (by the continuous embedding of Hq{D) into L^{D)) and 
satisfies the coercivity condition 

|yl(n, n; a)| > 3?(y4(n, n; a)) > r||n|ly, v E V, (2.78) 

By the complex version of Lax-Milgram theorem, duV{u{a), a) is thus an isomorphism from 
V onto W, and therefore assumption (ii) holds. We therefore conclude from Theorem l2.5l that 
there exists an open set O <Z X containing A, such that the solution map has a uniformly 
bounded holomorphic extension over O. 

Remark 2.6 Theorem \2.5\ may also be applied to treat linear parametric problems such as 
the previously discussed elliptic, parabolic or domain dependent elliptic equations which are 
already covered by the LBB theory. Its weakness however is that it does not give an explicit 
description of the domain where the holomorphic extension is defined, in contrast to the 
explicit conditions on a that can be established for these specific problems using Theorem 
12.21 Nevertheless, as it will be seen further, the sole existence of a holomorphic extension 
on an open neighbourhood of the compact parameter domain A turns out to be sufficient 
for deriving approximation results for the solution map which are immune to the curse of 
dimensionality. 

2.4 The uniform ellipticity assumption 

For the remainder of this section, and all of §3, we assume that the parameter space X is the 
complex Banach space L°° and that the parameter set A(Z X has an affine scalar representa¬ 
tion of the form fll.lSp . This allows us to view the solution map as y u{y) := u{a{y)). The 
focus of thie remainder of this section is to show that this map has a holomorphic extension 
to certain complex domains. 

Here and in §3, we assume that Assumption A holds for a suitable affine representer 
which means that the solution map a u{u) is well defined over 

a{U) := |a(|/) = a + '^yj^pj : y = {yj)j>o e t/|, (2.79) 

j>i 

where U = [—1,1]^, so that the map y u{y) is well defined from U to V. We also assume 
that 

(ll^tlU),>ie^'(N), (2.80) 

which implies that the series in (ITlbh converge absolutely for all y E U. In addition, this 
assumption guarantees the compactness of the set a{U) defined by fl2.79p . as shown by the 
following result. 













Lemma 2.7 Under the assumption fl2.80p . the set a{U) defined by fl2.79p is compact in X. 


Proof: Let (a„)„>i be a sequence in a{U). Since (||'0ilU)j>i G the sequence {an)n>i is 

bounded in X. Each On is of the form a„ = Vndfi’j- Using a Cantor diagonal argument, 
we infer that there exists y* = (2/j)j>i G U such that 

lim = y*, j > 1, (2.81) 

n—>-+oo 

where (cT(n))„>i is a monotone sequence of positive integers. Dehning a* := J2j>iyj'^j ^ 
a{U), we may write for any k > 1, 


|®(T(n) O II X ^ 


i=i 


X 


11"^. 






(2.82) 


It follows that ao-(n) converges towards a* in X and therefore a{U) is compact. □ 


Let us recall the four previously discussed examples of parametric PDEs, that is, equa¬ 
tions (ll.5|i . (I2.40p . (I2.58ji . and (I2.66ji . For these problems, we have seen that the solution 
map is dehned at any real valued a G X satishes 

a{x) > r, X e D, (2.83) 

for some r > 0. Here, the physical domain D is replaced by the angular domain [0,27r[ in 
the case of (I2.58p . When .4. is a compact set of the form (I2.79p . this condition is met for all 
a G 4. if and and only if 

a{x,y) = a(x) + '^yjfij{x) > r, x E D, y eU. (2.84) 

i>i 

By taking the particular choice yj = —sign.{'ipj{x)), we hnd that the above inequality is 
equivalent to 

^ |'0j(3^)l < 0 ( 3 ^) ~ xeD. (2.85) 

i>i 

We refer to (I2.84p or (I2.85p as the uniform ellipticity assumption of constant r, or UEA(r). 
In the case of fl2.58l) . the physical domain D is replaced by the angnlar domain [0,27r[, and 
UEA(r) thus means that, for all a E A, the domains Da are star-shaped with respect to a 
ball of snfficiently small radius centered at the origin. 

Our previous analysis showed that the assumption UEA(r) ensures that in all four 
examples the solution map y 1 —)■ u{y) is well dehned from U to V. We now want to build an 
extension z i-p- u(z) by setting 

a(z) = a + Zjfij, (2.86) 

i>i 

for snitable z = E and dehning 

u{z) := u{a{z)). (2.87) 
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This only makes sense for those z G for which a{z) is well dehned and falls inside the 
domain V G X where a ^ u{a) admits its holomorphic extension. At such a z, the chain 
rule ensures that the resulting map z u{z) is holomorphic in each variable Zj, with partial 
derivatives given by 

dzjU{z) = du{a{z))'ijjj. (2.88) 

Our next objective is to describe some relevant domains of on which the holomorphic 
extension z u{z) exists and is uniformly bounded. 

2.5 Holomorphic extensions of y i-A u{y) on polydiscs 

We hrst consider the elliptic problem fll.Sp and the parabolic problem fl2.40l) . For such 
problems, we have seen that the solution map a u{a) admits a holomorphic extension on 
the complex domain defined by the condition 3fJ(a) > r, with uniform bound 

|| w || l °°( d ,., v ) < Cr- (2.89) 


If UEA(r) holds, then 

3?(a(a;, z)) = d{x) + ‘Sl{zj)ipj{x) > r, x ^ D, (2.90) 

j>i 

holds for all 2 ; G such that |3fJ(zj)| < 1, and in particular for all z eU, where U is the 
unit polydisc 

U:={z = : \zj\ < 1} = < 1}. (2.91) 

This shows that the set 

a{U) := ^a{z) =d+'^^Zj'ijjj : zElJ^, (2.92) 

i>i 

is contained in In turn, the map 2 ; 1 —)■ u{z) is holomorphic in each variable Zj over U 
with the uniform bound 

sup ||■u(2;)||y < Cr- (2.93) 

z&U 

We next consider general polydiscs of the form 

lAp . \^z • \^j\ — Pj} ^ Pj} y (2.94) 

where p = {pj)j>i is a sequence of positive numbers. Then, for any t > 0 and any positive 
sequence p that satisfies the constraint 

^d{x) —t, X E D, (2.95) 

i>i 

we find that 

z ElAp ^ 3f?(a(x, z)) > t, X E D, (2.96) 
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which shows that a{Up) C Dt. Therefore, the map z ^ u{z) is holomorphic in each variable 
Zj over Up and the uniform bound 02.931) now holds for some constant Ct > 0. 

If UEA(r) holds and if 0 < t < r, we can find sequences p which satisfy 02.95P and such 
that pj > 1 for all j > 1, so that the polydisc Up contains U. In particular, let e := r — t > 0 
and p = {pj)j>i be any sequence of numbers such that pj > 1 for all j > 1 and that satisfies 
the constraint 

J>1 

Then, using UEA(r), we have 

'^Pj\i}j{x)\ < ^ - l)\\'ipj\\x < ^x) - r + e = a(x) - t, x e D. (2.98) 

i>i i>i i>i 

Therefore the map z ^ u{z) is holomorphic in each variable Zj over Up with again a uniform 
bound 

sup ||M( 2 ;)||y < Ct- (2.99) 

z^lAp 

We shall make further use of the following observation: if p satisfies one the above 
constraints fl2.95p or 02.971) . then for each j > 1, there is an open set Op. C C that contains 
the disc {\zj\ < pj} and such that the map z ^ u{z) is holomorphic in each variable Zj over 
the tensorized set 

Op:=®i>iOp,. (2.100) 

One possible choice is to take for Op. the open disc 

^pj {l^jl < Pj}y pj Pj + II / II ) (2.101) 

l^j>i\Wj\\x 

for some 0 < f < f, since we then have 

'^pj\'iljj{x)\<^Pj\'iljj{x)\+i<a{x)-{t-i), xeD, (2.102) 

i>i i>i 

which shows that a{Op) C 

In §3, we exploit these domains of bounded holomorphy in order to derive convergence 
results for polynomial approximations of the type fll.SOp that are obtained by truncation 
of the Taylor development of u{z) on suitable sets A„. For now, let us observe that the 
varying radii pj in each variable of the polydiscs Up reflect the anisotropy of the solution 
map 2 ; I—)■ u{z). Let us also note that the above discussion does not identify one particular 
polydisc Up. Instead, it shows that bounded holomorphy holds on all of the polydiscs Up 
associated to any of the sequences p which satisfy the constraint (12.971) or fl2.95p . 

Let us next observe that the above procedure of extending the solution map to poly¬ 
discs is not restricted to just the problems of Examples 1 and 2. More generally, we can 
obtain bounded holomorphic extensions on similar polydiscs with constrainted radii, for any 
parametric PDE that satisfies the assumptions of the following theorem. 
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Theorem 2.8 Consider a parametric problem of the form fll.ip such that Assumption A 
holds for a suitable affine representer {'4>j)j>i- Assume that (||'0j||x)i>i ^ and that 

the solution map a ^ u{a) admits a holomorphic extension to an open set O <Z X which 
contains the set a{U) defined by fl2.92p . with uniform bound 

sup ||M(a)||y < C. (2.103) 

aeo 

Then, there exists £ > 0 such that for any sequence p = {pj)j>i of numbers larger than or 
equal to 1 which satisfies the constraint fl2.97p . the following holds: for all j > 1, there exists 
an open set Op. C C that contains the disc {\zj\ < pj} for which the map y ^ u{y) admits 
an extension to the set Op defined by fl2.100p . and this extension is holomorphic in each 
variable Zj with uniform bound 

sup ||m(z)||\/ < C, (2.104) 

z^(D p 

with the same value of C as in fl2.103p 

Proof: We first observe that there exists 5 > 0 small enough such that the 5-neigbourhood 
of a{lA) is contained in O, i.e. 

IJ B{a,5)^0. (2.105) 

aGa{U) 

To see this, we observe that, by the same argument as used for a{U) in the proof of Lemma 
12.71 the set a(U) is compact. The distance function 

a 1 -^ dist(a, (9'^) := inf{||a — &||x : b ^ O}, (2.106) 

is continuous and strictly positive over a(U). By compactness of a(U), it reaches a strictly 
positive minimal value, and therefore fl2.105p holds by taking 5 > 0 strictly smaller than this 
minimal value. 

Next, we take £ > 0 strictly smaller than 6 and dehne 

S s 

^pj {1^11 ^ Pi}’ Pi Pi II / II ’ (2.107) 

l^j>i \Wi\\x 

so that by fl2.97p . we have 

5^(Pi - l)ll^ilU = 5^(Pi - l)ll^ilU + 5 - £ < 5. (2.108) 

i>i i>i 

For any 2 ; G Op, we define Zj := Zj min{l, which gives that 5 := {zj)j>i is in U and 

a{z) = a + Zj'tfj 
i>i 

= a + ^ Zjipj + '^{zj - Zj)^)^ 
i>i i>i 

= a{z) + r{z). 
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Since, 

\\riz)\\x < ~ l)llV’illx < <5, (2.109) 

i>i i>i 

it follows from (12.105^ that a{z) G O. Therefore y u{y) admits a holomorphic extension 
over Op with at least the same uniform bound. □ 

2.6 Holomorphic extensions of y u{y) on polyellipses 

The reader should notice that the results of the last section on extensions of the solution 
map to polydiscs were not applied to two of our main examples: the parametrized domain 
problem fl2.58p and the nonlinear problem fl2.66p . For such problems, in contrast to the 
elliptic and parabolic problems fll.Sp and fl2.40p . we are not ensured that the solution map 
a ^ u{a) admits a holomorphic extension to the whole domain defined by the condition 
3f?(a) > r. In turn, while the uniform ellipticity assumption UEA(r) ensures that the map 
y u{y) is well defined over U, it does not allow us to define its holomorphic extension on 
the polydisc U, or more generally on polydiscs Up for sequences p which satisfy (I2.95p . 

On the other hand, for both problems fl2.58p and fl2.66p . we have seen that if A is any 
compact set of real valued functions in X such that a > r for all a E A, there exists an open 
set (9 C A which contains A and such that the solution map a ^ u{a) admits a holomorphic 
extension on O. In particular, if UEA(r) holds and (||'0il|x)j>i £ we are ensured 

that such an open set exists for A = a{U) defined by fl2.79p . This allows us to define a 
bounded holomorphic extension to y ^ u{y) on complex domains that contain U, however 
with shorter extensions in the imaginary axes than the polydiscs Up. In §3, we exploit 
these domains of bounded holomorphy in order to derive convergence results for polynomial 
approximations of the type fll.hOp that are obtained by truncation of the development of 
u{z) into orthogonal Legendre series. 

To formulate the extensions we seek, we introduce some standard concepts from complex 
analysis. For any real number s > 1, we define in C the so-called Bernstein ellipse, 

£.:={ZZfl: lz\=s]. ( 2 . 110 ) 

which has semi-axes of length in the real axis and in the imaginary axis. Note 

that in the limit s —)> 1, we obtain Si = [—1,1]. The convex hull of Sg is given by the £lled-in 
ellipse 

•H. l<|z|<s|. (2.111) 

Note that 

[-l,l]cngC{\z\<s}, (2.112) 

Therefore, defining for any sequence {pj)j>i of numbers strictly larger than 1 the polyellipse 

Sp := ®j>iSp. (2.113) 
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and the filled-in polyellipse 


"Hp := 


(2.114) 


we find that 

U cKpC Up. (2.115) 

However, the set Tip has much shorter extension than Up in the imaginary axis for the 
coordinates j for which pj is close to 1. This allows us to derive bounded holomorphic 
extensions on such domains for the solution map z u{z) in the case of problems (I2.58p 
and fl2.66p . and more generally for any parametric PDE that fall under the assumptions of 
the following result. 

Theorem 2.9 Consider a parametric problem of the form fll.ip such that Assumption A 
holds for a suitable affine representer Assume that (Wfjj ||x)j>i e £^(N) and that 

the solution map u u{a) admits a holomorphic extension to an open set O G X which 
contains the set a{U) defined by (12.791) . with uniform bound 

sup ||M(a)||y < C. (2.116) 

a&O 

Then, there exists e > 0 such that for any sequence p = {pj)j>i of numbers strictly larger 
than 1 that satisfies the constraint fl2.97l) . the following holds: for all j > 1, there exists 

an open set Op. C C that contains the filled-in ellipse Tip. and such that the map y u{y) 

admits an extension over the set Op defined by fl2.100p . which is holomorphic in each variable 
Zj with uniform bound 

sup ||M( 2 ;)||y < C, (2.117) 

z&Op 

with the same value of C as in fl4.4p 

Proof: By the same compactness argument as used for a{U) in the proof of Theorem 12.81 
there exists 5 > 0 sufficiently small that the 5-neigbourhood of a{U) is contained in O, i.e. 

IJ B{a,S)cO. (2.118) 

aGa{U) 

We now define e = 6 and set Op^ to be the oval-shaped domain 

Op.:={z&C : dist( 2 ;, [—1,1]) := min \z — y\<pj — l}, (2.119) 

for which is is easily checked that Tipj C Op^. For any 2; = {zj)j>i G Op, there exists a 

y = i.yj)j>i ^ ^ such that 

\yj - Zj\ < Pj - 1, j>l. (2.120) 

We may therefore write 

a{z) = a + ^ Zj'ijjj = a{y) + r{z), (2.121) 

i>i 
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where 


( 2 . 122 ) 


a{y) = a + ^yjiJj e a{U), 
i>i 


and 

lk(^)|U < Y1 1% < 5^(Pi - l)llV'ilU <e = 5. (2.123) 

i>i i>i 

It follows from fl2.105p that a{z) G C>. Therefore ^ u{z) admits a uniformly bounded 
extension over Op. □ 

Remark 2.10 A general setting in which the existence of O in the above result is satisfied 
is provided by Theorem \2.5[ 


Remark 2.11 The assumptions of Theorem \2.9i are obviously weaker than those of Theorem 
\2.t^ since a{U) is replaced by a{U). 

Remark 2.12 Theorems \2.tA and \2.9\ can be formulated for a general map u from A to V 
that is not necessarily the solution map of a parametric PDE. Indeed, the only assumptions 
on u which are used in the proof of these results is that it admits a bounded holomorphic 
extension in neighborhoods of aiU) or a{U). In other words, the same conclusions in these 
theorems hold for any map u which admits a bounded holomorphic extension on an open 
set containing a(U) or a{U). On the other hand, as seen in §2.1, §2.2 and §2.3, the fact 
that u is the solution map to a parametric PDE can be utilized to prove the validity of these 
assumptions. 


3 Best n-term polynomial approximations 

In this section, we place ourself in the same framework as in ^2.41 we consider a parametric 
problem fll.ip . and assume that Assumption A holds for a suitable affine scalar represen¬ 
tation fll.lSp . The solution map y hg- uiy) := u{a{y)) is then well dehned from U to V. Our 
goal is now to establish convergence rates for specihc separable approximations of this map 
which are polynomials in the y variable. We construct these approximations by suitable 
truncations of inhnite expansions. 

3.1 Approximation by n-term truncated expansions 

Let us begin with some general remarks concerning the convergence towards u of separable 
expansions of the form 

(3.1) 

where acts from 1/ to M and u„ G V, for some countable index set IF. 


35 

















Definition 3.1 A sequence (A„)„>i of finite subsets of iF is called an exhaustion of iF if 
and only if for any z/ G there exists Uq such that i/ G for all n > Uq. Here we do not 
impose that if (An) = n. 


Definition 3.2 The series fl3.ip is said to converge conditionally with limit u in a given 
norm || ■ || if and only if there exists an exhaustion (A„)„>i of H such that 

= 0. (3.2) 

The series fl3.ip is said to converge unconditionally towards u in the same norm, if and only 
if fl3.2p holds for every exhaustion (A„)„>i of H. 

We are interested both in establishing unconditional convergence and providing estimates 
for the approximation error. One first instance where this is feasible is when {4>u)ueT is an 
orthonormal basis, as indicated by the following result which simply gathers well known facts 
from Hilbert space theory. 

Theorem 3.3 Let {(f>v)u&r be an orthonormal basis of L'^{U,p) for some given measure 
on U, and let u G L‘^{U, V, pi). Then, the inner products 

Uy ;= j u{y)(t)y{y)dp{y), n e F', (3.3) 

u 

are elements ofV, and the series fl3.ip converges unconditionally towards u in L^{U,V, p), 
with the error given by 

M ^ Uy(j)y 

I'GAn '' ' 

for any exhaustion (A„)„>i. 

Let us observe that if p is any probability measure, the L°°{U,V) norm controls the 
Lfi{U, V, p) norm. In the previous section, we have given various examples for which we are 
ensured that u is uniformly bounded over U, and we may therefore apply the above result 
whenever /r is a probability measure. 

We next give a general result which can be used to establish convergence and give error 
bounds in the L°° norms for truncating the expansion fl3.ip . 

Theorem 3.4 Consider an expansion fl3.ip for which the following hold: 

(i) The functions fiy : U M. are such that ||</>y|lL°o(;7) = 1, for all v ^ F . 

(ii) The functions Uy are in V and (||■u^y||y)^/G:F G i^{F), 

Then, whenever the expansion fl3.ip converges conditionally to a function u in L°°{U, V), it 
also converges unconditionally to u in L°°{U,V), and for any exhaustion (A„)„>i, we have 
the error estimate 

U - ^ Uyfiy 


< 




E 


\Uy\\V- 


(3.5) 


T.‘2(TLV.ik 


E 


\u 


I^WV 


i/Z 


(3.4) 


lim 

n^+oo 


U 


E 


u,. 
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Proof: Let (A„)„>i be any given exhaustion of and suppose that e > 0 is arbitrary. We 
know that there exists an exhaustion (A*)„>i and an such that 


\u 




■'ga* 


L°°{uy) 


< e, n > riQ. 


In addition, there exists m larger than uq such that 


\\uu\\v < 


(3.6) 


(3.7) 


Since (A„)„>i is an exhaustion, 
therefore 


\u 


T'e At, 


u,. 


L°°{U,V) 


< \\u 


This conhrms the unconditional 
of the triangle inequality. 


there exists ni such that AJ^ C An for all n > ni, and 


\\uu\\v <25, n> rii. (3.8) 

roo/rri/\ f ^ 




L°°{U,V) 




convergence. The estimate 03.51) follows by an application 

□ . 


In the particular case where {(t)v)v&T is an orthogonal basis normalized in L°°, the next 
theorem shows the same result holds without the need to assume conditional convergence. 


Theorem 3.5 Let {cl)^)u&j^be an orthogonal basis of for some given probability mea¬ 

sure p on U, normalized so that ||0 i/||l°°(;7 ) = 1 for all u & if. If u E L‘^{U,V,yi) and the 
inner products 


Uu := 




u{y)(t)u{y)dp.{y), V 




u 


(3.9) 


satisfy (||My||y),ygj- G then u G L°°{U,V) and the series 03.11) converges uncondition¬ 

ally towards u in L°°{U, V) and the estimate 03.5p holds. 


Proof: The summability of (||M[/||y)i/GJ' ensures that 03.Ih converges to a limit in L°°{U,V) 
and in turn in Lf{U, V, p). On the other hand, we know from Theorem 13.31 that it converges 
toward u G L‘^{U, V, p). Therefore, its limit in L°°{U, V) is also u. □ 


If the expansion 03.11) converges unconditionally towards u in some given norm || • ||, then 
a crucial issue is the choice of sets A„ that we decide to use to truncate 03.11) and define an 
n-term approximation. Since n measures the complexity of this approximation, we would 
like to hnd the set An which minimizes the truncation error in some given norm || ■ || among 
all sets of cardinality n, i.e. 


An '■= argmin 



#(A)= n , 


(3.10) 
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provided that such a set exists. This is an instance of best n-term approximation, which itself 
is an instance of nonlinear approximation. We refer to [21] for a general survey on nonlinear 
approximation. 

In the case where the error is measured in L‘^{U, V, p), and if is an orthonormal 

basis of L‘^{U,p) and {uy)y^j: are the coefficients of u in this basis, fl3.4p shows that the 
optimal is the set of indices corresponding to the n largest ||ni/||y. Note that such a set 
is not necessarily unique, in which case any realization of is optimal. 

In the case where the error is measured in L°°{U, V), there is generally no simple descrip¬ 
tion of the optimal set A„. However, when the functions are normalized in L°°{U), the 
right-hand side in the estimate fl3.5p provides a bound for the error of n term approximation. 
This upper bound is minimized by again dehning A„ as the set of indices corresponding to 
the n largest ||u,y||y. The only difference is that the error is measured by the tail of the 
sequence {\\uy\\v)u&Ti in contrast to the tail which appears in fl3.4p . Let us emphasize that 
this procedure gives only a bound for the error of best n term approximation in L°°{U, V) 
but is not guaranteed to be the best error. 

There is a good understanding of the properties of a given sequence (ciy)ygj- of real or 
complex numbers, which ensure a certain rate of decay n~‘^ of its tail after one retains 
its n largest entries. The following result, originally due to Stechkin in the particular case 
q = 2, show that this rate of decay is related to the summability of the sequence for values 
of p smaller than q. 

Lemma 3.6 Let 0 < p < q < oo and let g P'{£F) he a sequence of positive numbers. 

Then, if An is a set of indices which corresponds to the n largest c^, one has 



1 1 


(3.11) 


s : = 


q P 


u^A, 


Proof: Let {ck)k>i be the decreasing rearrangement of the sequence (ci/)ygj-. From the 
dehnition of A„, we have 



(3.12) 


On the other hand, we also have 


fc>n+l 


rx+1 



(3.13) 


Combining both estimates gives 



(3.14) 


ufA, 


which is flATTD . 


□ 
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Combining the above result with either fl3.4p or fl3.5p shows that a suitable summability 
of the sequence is a sufficient condition to guarantee a convergence rate when 

retaining the terms corresponding to the n largest ||Mi/||y in fl3.ll) . For the L'^{U, V, /i) error, 
and when is an orthonormal basis of /i), we obtain the rate s = ^ | if 

p < 2. For the L°^{U,V) error, and when the (pi, are normalized in L°°{U), we obtain the 
rate s = - — lifp<l. 

Remark 3.7 Lemma \S. hi shows that summability implies that the tail of [cy)y^jr after 
retaining the largest n-terms decays with rate n~^ where s ^ It is actually possi¬ 

ble to exactly characterize the properties which governs this rate of decay through weaker 
summability properties. Let us recall that for 0 < p < oo the space consists of those 

seguences {cv)v^t of real or complex numbers such that for a finite constant C > 0, 

#{i/ : \cu\ >p}< T] > 0, (3.15) 

or eguivalently such that for a finite constant C > 0, 

Ck < fc > 1, (3.16) 

where {ck)k>i is the decreasing rearrangement of The guasi-norm \\{ci,)i,^jr\\wiv(r) 

can be defined as the smallest C for which either one of these inegualities holds. Then, for 
0 < p < q < OD one can check that the tail of {cif)y(zjr after retaining the largest n-terms 
decays with rate where s ^ ^ if and only if {ci,)ueT £ wP’{F), see ISTf . 


3.2 Convergence of n-term truncated polynomial expansions 

We now restrict our attention to polynomial series. This corresponds to particular choices 
of the functions (pv as polynomials. For the remainder of this section, we take IF to be the 
set of all sequences z/ = of non-negative integers which are finitely supported. For 

u G F', we use the notation 

||i/||o := #(supp(z/)) < cx), supp(i/) := {j > 1 : izj ^ 0}. (3.17) 


as well as 

W\ •= ll^lli = ^ 

i>i 

For any z = (zj) G C^, and n E F, we define 


n 

j>i 


z^ :=\\ z]\ 


(3.18) 


(3.19) 


We consider three type of polynomial series: 
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Taylor (or power) series of the form 




v^T 


where 

tu ■■= ^d‘'u{y = 0), 
with the convention that 0! = 1. 


(3.20) 


(3.21) 


Legendre series of the form 


^v^L^{y), L^(|/) = JJl^.(|/j), 

i>i 


(3.22) 




where {Lk)k>o is the sequence of Legendre polynomials on [—1,1] normalized with 
respect to the uniform measure, i.e. such that 


inwtf = 1. 


-1 


It follows that is an orthonormal basis of L‘^{U,fi), with 


y = ®j>i 


2 ’ 


the uniform measure over U. The coefficients Vy are therefore given by 


(3.23) 


(3.24) 


Vv := J u{y)Ly{y)dy,{y). 
u 


(3.25) 


Renormalized Legendre series of the form 


WyPy (y) , Py (?/) = jj Py^ (|/j) , 

i>i 




(3.26) 


where {Pk)k>o is the sequence of Legendre polynomials on [—1,1] with the standard 
normalization 

||Pfc||L°“([-i,i]) = Pfc(O) = 1. (3.27) 

One has Lk = \/l + 2kPk, and therefore the coefficients Wy are given by 


Wy := (^]^(1 + 2z/. 
i>i 


1/2 

JJ I '^I'l 


(3.28) 


where Vy is defined by fl3.25p . 





In the case of the Taylor series, the following result shows that the assumptions in The¬ 
orem [52] ensure the conditional convergence of fl3.20p towards u in L°°{U, V). 

Theorem 3.8 Consider a parametric problem of the form (11.11) such that Assumption 
A holds for a suitable affine representer {'4’j)j>i- If the assumptions of Theorem \2.8\ are 
satisfied, then the Taylor expansion fl3.20p converges conditionally towards u in L°°{U,V). 

Proof: Under the assumptions of Theorem 12.81 the Frechet derivative of the solution map 
a e-)■ u{a) is uniformly bounded over a(ld) and therefore 

M := max \\du{a)\\c{xy) < oo. (3.29) 

aGa{U) 

From the assumption that (||'0illv)j>i G for any n > 1, there exists J = J{n) be such 

that 

Z ll^.ll- S (3.30) 


Increasing the value of J decreases the left side, so we may assume that J{n) > n. 

We know from Theorem 12.81 that the map y u{y) admits a holomorphic extension 
z I—)■ u{z) to domains Up that contain U. For any z = G U, we dehne its truncation 



o' 

0 "' 

(3.31) 

and the map 

J 


v{z) := 

u{Tjz) = u{a{Tjz)) = u(a + • 

1-1 

(3.32) 

Since, for 2 ; G W, we have 

1 a(z) 

1 a{Tjz)\\x< - 2nM’ 

j>J+i 

(3.33) 

it follows from 113.29p that 


\\u-v\\loo(^u,v) < 

(3.34) 

Now, we can write 


v{z) = w{zi,.. .,Zj), 

(3.35) 


where the hnite dimensional map w is bounded and holomorphic in each variable Zj on an 
open neighborhood of the unit polydisc Uj := ®'j=\{\zj\ < 1}. It follows that w has a Taylor 
expansion that converges on Uj. Its Taylor coefficients are given by the fj, for all v of the 
form (i^i,..., z/j, 0, 0,...). Therefore, there exists K = K{n) > n such that for 

A„ := {z/ G : supp(z/) C {1,..., J} and |z^| < A}, (3.36) 


41 
















one has 


^ 1 
V ~ 2n 


(3.37) 


and therefore 


sup 

z&A 


v{z) - ^ ti.z'' 

ueAn 


sup 

Z&U 


\u{z) - t^z'^Wv < 

n 




(3.38) 


Since both K{n) and J{n) tend to inhnity with n, the sequence of sets (A„)„>o is a exhaus¬ 
tion of We have thus proved the conditional convergence of (13.201) towards u in L°°{U, 17), 
and in turn in L°^{U, V). □ 


We are now in position to state our main result which gives simple conditions that 
guarantee the summability, 0 < p < 1, of the sequence {\\uy\\v)v^T) where Uy is either 
fy, Vy or Wy. These conditions are expressed in terms of the summability of the sequence 
(||'0j||x)j>i for the same value of p, and the assumptions in Theorem 12.81 in the case of or 
in Theorem 12.91 in the case of v„ or w^. 


Theorem 3.9 Consider a parametric problem of the form fll.ll) such that Assumption A 
holds for a suitable affine representer {'ipj)j>i- Then, the following summability results hold: 

(i) If the assumptions of Theorem \2.8\ are satisfied, and if in addition (||'0j||x)j>i G -^^(N) 

for some p < I, then (||ti/|lv)i/Gj:- G for the same value of p. 

(ii) If the assumptions of Theorem \2.9\ are satisfied, and if in addition (||'0j||x)j>i G -^^(M) 

for some p < 1, then (||uy||y)ygj- G and (||t(;iy||y)i,gjr g for the same value 

ofp. 

The proof of this result is given in ^3.71 For now, we use this theorem together with the 
previous results of this section to obtain corollaries on the rate of convergence of n-terms 
approximations obtained by truncation of Taylor or Legendre series. 


Corollary 3.10 Consider a parametric problem of the form fll.ip such that Assumption 
A holds for a suitable affine representer If the assumptions of Theorem \2.8\ are 

satisfied, and if in addition (||'0illx:)j>i ^ -^^(N) for some p < 1, then the Taylor series 
fl3.2Up converges unconditionally towards u in L°°{U, V). Moreover, for any set A„ of indices 
corresponding to n largest o/ ||C||y, we have 


sup 

yec/ 


An 


< C(n + l)-^ 



1 , 


(3.39) 


where C := ||(||d|y)j/eJ-||£p < oo- 
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Proof: Using the first part of Theorem 13.91 we are ensured that (||ti/||F)i/GJ' ^ Since, 

by Theorem 13.81 the series fl3.2Up converges conditionally, by application of Theorem 13.41 we 
find that it also converges unconditionally with the error bound 


sup 

yeu 




< X] imw- 


We now use fl3.1ip with = \\t^\\v and g = 1 to obtain the error bound 03.391) . 


(3.40) 

□ 


Corollary 3.11 Consider a parametric problem of the form 01.ip such that Assumption 
A holds for a suitable affine representer {'4’j)j>i- If the assumptions of Theorem \2.9\ are 
satisfied, and if in addition (||'0j||x)i>i £ for some p < 1, then the Legendre series 

03.22P and 03.26P converges unconditionally towards u in L°°{U, V) and in L‘^{U, V, ju) where 
H is the uniform probability measure. In addition, we have the following error bounds: 


If An is the set of indices that corresponds to the n largest Hvi/Hv, we have 

^ ^ VuLp 


u 


i^GA„ 


<C(n + l)-^ s = l-l 

L^{uy,fi) p 2 


(3.41) 


where C := ||(|lw||v)i/eJ'lkp < oo. 


If An is the set of indices that corresponds to the n largest ||t(;y||y, we have 


u 


WyPv 




L^(uy) 


<C{n + l) s = -1, 


P 


(3.42) 


where C := || (||wi/|1v)i/gjg|1£p < oo. 


Proof: Using the second part of Theorem l3.91 we are ensured that (||ni.||y)i/gj- and (||tCi/||y)ygj- 
belong to The unconditional convergence claims in the theorem are ensured by The¬ 

orems 13.31 and 13.51 These latter two theorems also give the estimates 




Eii^ 

lid'''" 

I'WVJ 5 

(3.43) 







and 







\\u - y WnPu 


= E 

\\Wy\\v. 

(3.44) 


V^An 





The application 

of 03.IIP with Cy = 11 ^ 1,11 

y and q = 

-- 2, or 

with 1 tCy||y and q = 

1 , give the 

error bounds fpT 

Ti} and 03.42P. 




□ 

Remark 3.12 

Note that since we have 






VyLy = 

WyPy, V 



(3.45) 


the terms in the series 03.22p and 03.26P are actually identical. However the sets A„ defined 
by the n largest Hn^Hy or the n largest which are used to define the truncations for 

If or L°° estimates in the previous result, generally differ from each other. 
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The above corollaries show the curse of dimensionality can be broken for relevant class 
of parametric PDEs: although the solution map y ^ u{y) has inhnitely many variables, it 
can be approximated in various norms with an algebraic rate n~^, where n is the number of 
term in the separable expansion. The exponent s can be large if (||V'j|lx)j>i G for a 

small value of p. Several critical ingredients have been used in order to reach this conclusion: 

• The holomorphic extension of the solution map a ^ u{a). 

• The anistrotropy of the solution map with respect to the different variables yj. 

• The use of best u-term polynomial approximations. 

The fact that anisotropic smoothness may allow certain numerical methods to break the 
curse of dimensionality, in the sense that approximation results are immune to the growth 
in the number of variables, has also been studied in information based complexity, using 
certain weight sequences in oder to quantify anisotropy, see [6^ . 

Remark 3.13 Theorem \3.S\ and its corollaries can he formulated for a general map u from 
to V that is not necessarily the solution map of a parametric PDE, since as observed in 
B.emark \2.12\ Theorem,s \2. and \2 . .91 hold in this more general framework. 

3.3 Estimates of Taylor coefficients 

In this section, as well as the two that follow, we establish upper estimates for the I/-norms 
of the Taylor coefficients t^ and Legendre coefficients Vi, and w,,, which are instrumental 
in the proof of Theorem 13.91 These estimates are derived from the results on holomorphic 
extensions of the map y e-?> u{y) established in Theorems 12.81 and 12.91 Namely, by an 
application of the Cauchy integral formula in the different complex variables Zj. 

We recall that the Cauchy formula states that if is a function from C to a Banach 
space V which is holomorphic on a simply connected open set O C C and if T is a closed 
rectifiable curve contained in O, then for any 5 contained in the bounded domain delimited 
by T, 

r 

where the fraction in the integrand stands for the scalar multiplication of g:>{z) G V by 
{z — z)~^ G C and the curve T is positively oriented in the integral, see for instance Theorem 
2.1.2 of [19]. 

We begin with the estimates on Taylor coefficients which are based on the bounded 
holomorphic extensions onto polydiscs Up that were established in Theorem 12.81 

Lemma 3.14 Consider a parametric problem of the form fll.ip such that Assumption A 
holds for a suitable affine representer If the assumptions of Theorem \2.S{ are satisfied, 

then there exists an e > 0 and a C > 0 such that the estimates 

\\U\\v<Cp-^ = Cl[pp, uEJ^, (3.47) 

i>i 
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(3.48) 


hold for any sequence p = {pj)j>o of numbers larger than or equal to 1 for which 

- l)llV'ilU < £• 

j>i 

Proof: Let e > 0 and C be as in Theorem 12.81 and let p = {pj)j>i of numbers larger than or 
equal to 1 satisfying the constraint fl3.48p . For each j > 1, let Op. C C be the open set that 
contains the disc {\zj\ < pj} given in Theorem 12.81 Then, we know that the map y u{y) 
admits an extension z m(z) onto the set Op dehned by (12.1 OOp . which is holomorphic in 
each variable Zj with uniform bound 

sup ||m(z)||\/ < C. (3.49) 

z^(D p 

For any given u E T, we dehne 


J := J(z/) := max{j : Uj ^ 0}. (3.50) 

Similar to fl3.35p . we introduce the hnite dimensional function w dehned by 

w{zx,...,zj)=u{fTjz), Tjz = (zi,... ,zj,0,0,...), (3.51) 

so that we have for this particular i/, 

P-52) 

We know that w is holomorphic on the set 

Opp := Opj^ (3.53) 

which is an open neighborhood of the J-dimensional polydisc 


In addition, we have 


hlp,J ®l<j<J hipj. 
sup \\w{zi,..., Zj)\\v < C 


(3.54) 

(3.55) 


We may thus apply the Cauchy formula (13.461) recursively in each variable Zj and obtain for 
any (5i,..., zj) in the interior of Upp a representation of w{zi ,..., zj) as a multiple integral 

w{zi,...,zj) = {2'Ki)--^ f ... f w{zi ... zj) - (izi...dzj. (3.56) 

J J [Zi- Zi)... [Zj - Zj) 

hil=pi hj|=pj 

By differentiation, this yields 

= j ... j ^^ll222^dzi...dzj, (3.57) 

hi|=pi hj|=pj 
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(3.58) 


and therefore, using fl3.55p . we obtain the estimate 

3<J 

which is equivalent to (I3.47p . □ 

Let us comment on the estimate fl3.47p . Since we may take any sequence p on the right- 
hand side, as long as it satishes the constraint fl3.48p . we have the estimate 

||^!/||y < C'min ; "^{pj - l)\\'ijjj\\x < £ and pj > 1, j > . (3.59) 

i>i 

It is possible to characterize the sequence p* for which the minimum in the above right-hand 
side is attained. An important observation is that this minimizing sequence depends on v. 

To hnd p*, we observe that for a given v, this minimization problem is in fact hnite 
dimensional since p~'^ is not influenced by the values of pj for those j such that Vj = 0. 
Since p~'^ is monotone non-increasing with pj for the other values of j, and in view of the 
constraint 03.481) . we should thus set 

p* = 1, j ^ supp(z/). (3.60) 

It remains to solve the hnite dimensional problem 

min { n pp ■ E {pj - l)||V'jlU < e and Pj > 1, j e supp(i/)|, (3.61) 

iGsupp(i/) jGsupp(i/) 

or equivalently 

max { E log(Pi) : ~ l)llV'ilU < £ and Pj > 1 , j e supp(i/)|, ( 3 . 62 ) 

iesupp(!^) jGsupp(i/) 

which admits a unique solution since we minimize a strictly concave function over a convex 
set. The solution necessarily satishes the equality constraint 

E (p]-1)II'A)IU=£' (3'63) 

iesupp(j/) 

For the optimal solution p*, if £' C supp(i/) is the subset of those j G supp(z/) such that 
p* > 1, there exists a Lagrange multiplier A G M such that 

^ = A||^,|U, jeE. (3.64) 

Pj 

For any index v = {vj)j>i G E and E gN, we use the notation 

■= {Pj)j>i, Pj = if j ^ E, Vj = 0 otherwise. (3.65) 


l|5M0)lk = 




w 


... dzy 


( 0 ,..., 0 ) 
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Combining fl6.4p and fl3.63p . we thus find that 


A = 


\^e\ 


where ag := llV^jlU- 

jeE 


(Te + £ 

Therefore, the solution p* = p*{y) = {p])j>i, has the form 

Ej{crE + £) 


Pi = 


\^e\ \\t 


’j\\x 


iijeE, p* = liij^E. 


(3.66) 


(3.67) 


This characterization is not satisfactory since the set E is not explicitly given. However, 
given any set E, we can define A by (I3.66p and define a coresponding sequence (pj) as in 
fl3.67p . Therefore, the minimum we seek is the same as 


min 

E 



1 


(3.68) 


over all sets E C supp(i^) for which the corresponding pj given as in fl3.67p are strictly larger 
than 1 for all j G E. The optimal set E is the one for which this minimum is reached. 
This is a combinatorial problem which is not easy to solve except for those z/ G of small 
support. For this reason, we do not make further use of the above optimal sequence p*^^) 
in bounding ||ti/||v. Instead, we use in ^3.61 certain suboptimal choices p{v) which have an 
explicit expression inspired by 03.671) . 


3.4 Refined estimates for elliptic and parabolic PDEs 

The estimate 03.471) can be refined in the particular case of the elliptic and parabolic problems 
01.5p and 02.401) . We recall that for each of these problems, the parameter a is taken in 

X = L°°(T>), (3.69) 

and that the uniform boundedness and holomorphy of the solution map is ensured under a 
condition of the form 3fJ(a) > t for some f > 0. In such a case, we have seen in §2.51 that 
when the sequence p = {pj)j>i fulfills the constraints 

'^^Pj\'ijjj{x)\ <a{x) —t, X E D, (3.70) 

i>i 

for some f > 0, the holomorphic extension is defined over the polydisc Up with uniform bound 

sup ||M( 2 ;)||y < Ci. (3.71) 

z^lAp 

By a recursive application of Cauchy’s formula, as in the proof of Lemma 13.141 we now 
obtain, for any fixed t > 0, the estimate 

\\tu\\v < Ct min pj\'iljj{x)\ <a{x) - t, x G iA|. (3.72) 

i>i 
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It is not clear how to give a simple characterization of the above minimization problem, due 
to the form of the constraints fl3.7UI) which need to be fullhlled for every x E D. There are 
however two particular instances of affine decompositions where such a simple characteriza¬ 
tions exists. 

The first of these is when the ipj have disjoint supports, by which we mean that 

Isupp(V'i) n supp(V'j)| =0, i 7^ j. (3.73) 

In this case, the uniform ellipticity assumption UEA(r) holds if and only if 

\'ij>j{x)\ < a{x) — r, X E D, j > 1. (3.74) 


This instance is sometimes referred to as the model of disjoint inclusions. One particular 
example is when a is piecewise constant over a finite or infinite partition of D, in 

which case a is a strictly positive constant and ijj = CjXd for some positive numbers Cj 
each of them smaller than a — r. 

In the general case of disjoint supports of the ijj, the constraint fl3.70p can be decoupled, 
so that the minimization problem on the right-hand side of fl3.72l) is equivalent to 

minlp”^' : pj\'il!j{x)\ <a{x) — t, x E D, j > l|. (3.75) 

The optimal solution p* to this problem is obviously given by 

(3. 

Let us note that in that case p* does not depend on v. This leads us to the estimate 



p* = inf 

x&D 


a{x) — t 


\i}j{x)\ 


\\tu\\v — n 
i>i 


\'lpj[x)\ 

sup - 

xGD OyXj 


(3.77) 


If UEA(r) holds, we see that we can take each p* strictly larger than 1 if we take 0 < f < r, 
for example by setting t = |. In such a case, we have indeed 


aix) — t a(x) — t ^ r — t ^ r 

- >1-1- =- > 1 H-=—, 

|'0j(a;)| a{x) — r a{x) — r 2||a||x’ 


(3.78) 


which shows that p* > 1. Note that the values p* increase as t decrease, which in principle 
results in a better bound for ||ti/||v- However the constant Ct tends to -|-cxo as f —)■ 0. One 
may in principle search for an optimal value of t, however we do not enter this discussion. 

The second instance is when the ipj are functions of constant moduli, such as complex 
exponentials. In this case, the uniform ellipticity assumption UEA(r) holds if and only if 


llV'ilU < Omin - Omin := mina(x), (3.79) 

x£D 

i>i 
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and the minimization problem on the right-hand side of fl3.72p is eqnivalent to 

minlp-"^ : y^^pjWjjjWx < Qmin - 


(3.80) 


i>i 


By the same Lagrange multiplier approach which we used above for the characterization of 
the minimizer in fl3.59p . we hnd that the above minimum is attained for p* = p*{v) = {p])j>i 
given by 

Z/jIdjjiin t) 

Pj = 


\jy 


’j\\x 


(3.81) 


This leads us to the estimate 


< Ct 




1^1 J-J 1^11 


i>i ^ 


(3.82) 


3.5 Estimates of Legendre coefficients 

Returning to general parametric PDEs of the form (11.11) with an affine representation (ll.lOp . 
our next objective is to establish similar estimates for the Legendre coefficients ||njy||v and 
||tCi,||y. Let us recall that these coefficients are given by 


V 


U 


j u{y)L^{y)dp{y), 
u 


(3.83) 


and 


Wy 


+ 1) / u{y)Py{y)dp{y), 

i>i 


They are linked by the relation 


(3.84) 


Wy = 


1 y/2 

(n(l + 2n^)) h 

j>i 


(3.85) 


We introduce the function 


11 -^ 9(t) 


irt 

2((-l)’ 


(3.86) 


which is monotone non-increasing over ]1, -|-cxo[. 

The following result establishes estimates on the Legendre coefficients, based on the 
bounded holomorphic extension of u onto the polyellipses Tip and their neighborhood Op 
established in Theorem 12.91 


Lemma 3.15 Consider a parametric problem of the form fll.ll) such that Assumption A 
holds for a suitable affine representer {'ipj)j>i- If the assumptions of Theorem \2.9\ are satisfied, 
then there exists e > 0 and C > 0 such that the estimates 

\\vu\\v <C II e{pj){l + 2u,f^pp, (3.87) 

iesupp(!^) 
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and 


\Mv<C J] 0(p,)(l + 2 z/,)p/^ (3.88) 

iGsupp(i/) 

hold for any sequence p = (pj)j>o of numbers strictly larger than 1, which satisfies the 
constraint fl3.48l) . 

Proof: We only need to prove (13.871) . since (13.881) then follows by (13.851) . Let e > 0 and C 
be as in Theorem 12.91 and let p = {pj)j>i be a sequence of numbers strictly larger than 1, 
which satishes the constraint (I3.48p . We know that for each j > 1 there exists an open set 
Op. C C that contains the hlled-in ellipse l-ip. and such that the map y u{y) admits an 
extension z u{z) over the set Op dehned by (I2.100p . which is holomorphic in each variable 
Zj with uniform bound 

sup ||M(z)||y < C. (3.89) 

zeOp 

We observe that U (Z Op. 

In the case z/ = 0, the estimate (I3.87P is immediate since 


Ikolk 


j u{y)dp{y) 
u 


< sup \\u{y)\\v < c, 

yeu 


(3.90) 


where we have used the fact that /i is a probability measure. We now assume that u ^ 0. Up 
to a reordering of {'ipj)j>i, we may assume without loss of generality that Uj 7 ^ 0 for j < J 
and Uj = 0 for j > J for J = |supp(z/)| > 1. We partition the variable y into 


2/ = (2/i,---,2/j,2/')> y’■= {yj+uyj+ 2 ,---) = u, (3.9i) 


and rewrite (13.841) as 


Wy = JJ( 2 z/j + 1 ) / v{y')dp{y'), 
i=i i 


where 


v{y') ; = 


: f u{yu...,yj,y') (YlPy^iyj)^^ 


(3.92) 


(3.93) 


For a hxed y' G U, we use the holomorphy of the hnite dimensional map (^i,... ,zj) e->■ 
u{zi,... ,zj, y') in order to evaluate \\v{y') ||y. For this purpose, we introduce for any integer 
n > 1 the following function of a single complex variable .2 


1 



-1 


(3.94) 
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and the corresponding multivariate functions 


■ ■ ■) ^j ) • Q vj ) ) 

i=i 


(3.95) 


which are well dehned as long as \zj\ > 1 for j = 1,..., J. For our given p, we introduce the 
J-dimensional polyellipse 

£p,j ■= ®i<j<J^pj- (3.96) 

Since Pj > 1, 1 < j < J, the unit interval [—1,1] is contained in the interior of each £lled-in 
ellipse 'Hpj. Therefore, we may recursively apply Cauchy’s integral formula on each ellipse 
Sp^ for each of the variables , j = 1,..., J , and obtain 


u{yi, ■ ■ -^yj^y') = 


(27ri) 


^pi £pj 


u{zi,...,zj,y') 

(pi - zi)... {yj - zj) 


dzi... dzj, (3.97) 


for any (pi,..., yj) G [—1,1]'^ and any y' G U. Multiplying by nj=i integrating 

over [—1,1]"^ with respect to ^ ... we therefore obtain 


v{y') = 


dvr 


u{zi, ...,zj, y')Qy{zi ,..., zj)dzi... dzj. (3.98) 


^pi ^pj 


From the uniform bound fl3.89p we know that 

(zi,..., zj) G 8p^j and y' E U ^ (zi, ...,zj, y') EOp^ \\u{zi, ...,zj, y') ||y < C. (3.99) 
Injecting this bound in the above integral yields 


'(!/')iiv<c(n 

i=i 


max \Q^{zi,...,zj)\, y'e U, 

Z / (zi,...,zj)GSp_j 


(3.100) 


where we have used the fact the perimeter of Sp^ has length smaller than 27rpj. We now use 
the following estimate (see page 313 of 


which yields 


and therefore 


TT t ^ 

max|Q„( 2 ;)| < -— 

Z^tt t — I 


J -yj 

n il p ■ 

-y, 

. Pj — 1 

j=l y 


'(!/')iiv<cn«(ft)pr. 

i=i 


(3.101) 


(3.102) 


(3.103) 
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Combining this estimate with fl3.92p . we obtain 


j 

\\wAv < JJ(1 + 2z/j) sup ||u(2/')||y <C JJ G{pj){^ + (3.104) 

i=l ^ jGsupp(i^) 

which is (13:871) . □ 

The estimates fl3.88p and (13.871) are very similar to the estimate (13.471) obtained in Lemma 
13.141 for the Taylor coefficients, however with two noticable differences: 

• On the one hand, the estimates for the Legendre coefficients are a bit more pessimistic, 
due to the presence of the additional factors 0{pj) and {l+2uj). Intuitively, these factors 
are absorbed by the decay of the factor p- when pj or Vj become large. The analysis 
in the next section conhrms that they do not affect the summability properties of 
the estimate. 

• On the other hand, these estimates are obtained under much weaker conditions than 
those of Lemma [3.141 Indeed Theorem 12.91 only requires the existence of a holomorphic 
extension of the solution map a u{a) in a neigborhood of a(17), in contrast to 
Theorem 12.81 which requires a neighborhood of a{U). In particular, for problems such 
as (I2.58P or (12.661) . we know that the conditions of Theorem 12.91 are met but not those 
of Theorem 12.81 

Similar to the estimate for Taylor coefficients, we can use the fact that (I3.88P and (13.871) 
hold for any sequence p satisfying the prescribed constraints, in order to obtain the estimates 

||uj.||y < Cinf I + (3.105) 

iesupp(!^) 

and 

\\wy\\v < C'inf I 0{pj){l + 2vj)p~''^^, (3.106) 

iesupp(j/) 

where the inhma are taken over all sequences p of numbers strictly larger than 1, such that 
- l)llV'j||x <£• 

Remark 3.16 The values of pj enter the above estimates only for j G supp(z/). This implies 
that we can consider the above infimas over all sequences p of numbers larger or equal to 1 
with Pj > 1 z/j G supp(y) and such that X]jgsupp(y) iPj ~ which amounts in 

taking pj = 1 if j ^ supp(z/). 

3.6 Summability of multi-indexed sequences 

We want to use the upper estimates obtained for ||tj/||y, ||'yj/||y and ||tny||y derived in the 
previous sections in order to prove Theorem 13.91 As a preliminary step, we establish in 
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this section several results concerning the (P summability of certain type of multi-indexed 
sequences, which appear in the proof of Theorem 13.91 that follows. 

We begin by considering sequences of the form where b = is a given 

sequence of positive numbers. For such sequences we have the following elementary result. 

Lemma 3.17 For any 0 < p < oo, the sequence {b’^)ueT belongs to if and only if 

b G and ||&||£°o < 1. Moreover 

< exp(^Cp^^), Cp := (3.107) 


Proof: For any positive integer J, let iFj denote the set of those u E iF such that supp(z/) C 
{1,..., J}. Now, if < 1, we can write 

n E'’r= n ^ = 1.2..... (3.108) 

v^Tj n >0 ^ 

If 6 e we can let J tend to -|-cxo and obtain 

E= n rrv < 

udF j>l i 

This proves the one implication of the theorem. Since, 

nT:^ = n(^ + T^) -n'^^p(T^) <YleMcpb^j) = (S.no) 

t>i ^ i>i J i>i ^ j>i 

we also have the bound fl3.107p . 

For the other implication, we observe that the sequences {bj)j>i and (&”)n>o for any 
j > 1, are subsequences of corresponding to particular selections of indices z/. This 

shows that the P’ summability of {lF)y^j: implies both that b G and ||fe||£°o <1. □ 


One immediate application of the above lemma concerns the P summability of the Taylor 
coefficients for the elliptic or parabolic problems in the model of disjoint inclusions discussed 
in ^3.41 In this case, the estimate fl3.77p has the form 


ll^i^llv E Ctb'', where b = with bj := sup 




xGD Cl\X j t 


(3.111) 


Working under UEA(r) and taking t = |, we know from fl3.78l) that for X := L°^{D), 


I / 2||a||x ^ ^ 

\£°° — — _ _ \ 1. 


2||a||x + r 


(3.112) 
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Since in addition 


(3.113) 


bj < 


nt 


’j\\x 


this shows that (||'0illw)i>i G implies b G Combining these observations with 

Lemma 13. 171 we thus find that if UEA(r) holds and if (||t/^j|lj\:)j>i ^ then the sequence 

(||^i/||belongs to which is a particular case of Theorem 13.91 


Remark 3.18 We have mentioned in Remark \d.l\ that the convergence rate n~^ of best n- 
term approximation in spaces is equivalent to the property of weak P’ summability with 
— -. Therefore, a relevant question is whether the above Lemma \3.11\ is valid with P 


s = 


replaced by wP. Surprisingly, the answer is negative, and closely related to classical results 
in number theory. Indeed, fix any 0 < p < 1 and consider the prototype sequence b G wP(N) 
given by 


6, = (j + l)-i/C (3.114) 

This sequence also satisfies ||&|l£°o <1. If we were to have G wP{IF) then there would 

he a constant C such that for any rj > 0, we have 


: P>ri}< 


or equivalently, such that for any A > 2, 

t(A) := e JT : 1 [P<a}<CA. 

i>2 


(3.115) 


(3.116) 


The left side can be rewritten as 

IA\ 

= (3.117) 

n=2 

where f{n) is the number of possible multiplicative partitions ofn. The problem of counting 
multiplicative partitions of natural numbers, sometimes refered to as factorisatio numerorum, 
has been extensively studied in number theory, see in particular m which gives a sharp 
asymptotic bound for f{n). In fERf, it is proved that the total number of multiplicative 
partitions t{A) has the asymptotic behaviour 


A 


~ exp 


4^1og(7l) 
\/^ log (log (A)) 


(1 + 0 ( 1 )) 


—> +CXO 


(3.118) 


as A —>■ + 00 . This shows that fl3.116p does not hold, and thus that {b’^)ueT does not belong 
to wP{IF). 


We make further use of a slightly more general version of Lemma 13.171 where we incor¬ 
porate additional algebraic factors into the sequence P. 
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Lemma 3.19 For a given sequence b = {bj)j>i of positive numbers, and for non-negative 
numbers c and r, let {by)y(zjr he defined by 

6, ;= 6" l[il + cz/}) = n(l + (3-119) 

j>i j>i 

For any 0 < p < oo, this sequence belongs to if and only if b E and ||&||£°o < 1. 

Proof: Since b^ > b'', the “only if” part follows from Lemma 13.171 and therefore we only 
need to prove the if part. With iFj as in the proof of Lemma 13.171 we write 

E = n E(1 + (3-120) 

v£Tj n>0 

Since ||fe||£°o < 1 we hnd that 

+ <^ + Clfj, (3.121) 

n>0 

where the constant C depends on c, r, p and ||&|l^°o. Since b G £^(N), this shows that the 
product on the right side of (13.1201) converges as J —)■ cxo. Therefore (fci/)ygjr G □ 

The estimates obtained for ||tj/||y, 11^1/|lv and ||tCi/||v also involve quantities of the form 

J (3.122) 

n^i"/ 

for sequences d = {dj)j>i of positive numbers. In view of the Stirling inequalities 

n! < n” < n!e", (3.123) 

we may write 

Iz/| I I Iz/O Iz/O 

J ' (3.124) 

nj>i 

where 

b = {bj)j>o, bj = edj. (3.125) 

This suggest studying the summability of sequences of the form . Due to the 

presence that the multinomial factor which can be much larger than 1, we expect that 
the conditions for summability are more stringent than for the sequence {b^)u^r- This is 
conhrmed by the following result. 

Lemma 3.20 For any Q < p < 1, a sequence belongs to £^(J^) if and only if 

b G £^(N) and ||&||^i < 1. 
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Proof: We first observe that whenever b G the multinomial formula gives 


\u\=k j>l 


(3.126) 


Summing over k we see that 
Moreover, 






is in if and only if 6 G and ||&||£1 (n) < 1- 


v\ 










1 - 




(3.127) 


Now suppose that () G for some p < 1. Then, b is in £p(N) since it is a 

subsequence of b corresponding to a particular selection of indices u. Also b is in so b 

must be in with norm smaller than one. 

Conversely, assume that b G £p(N) and < 1. We claim that there exists two positive 
sequences c = {cj)j>i and d = {dj)j>i with the following properties: 

(i) bj = Cjdj for all j > 1. 

(ii) c G £^(N) with ||c||p < 1. 

(hi) d G £'^{'N) with - = ^ — 1, or equivalently q = and \\d\\e<^ < 1. 


Before proving this claim, let us show that it implies the summability of ( -^b'' 
Indeed, from Holder’s inequality, we have 




E 


z/ n 

1a^ 


E 




Ul 

1/&T u&T 


dP'' 


1-p 


As observed previously in (13.1271) . the hrst factor is hnite due to the fact that ||c||£i < 1. 
The second factor is hnite by application of Lemma 13.171 

It remains to prove the claim by constructing specihc sequences c and d having the 
prescribed properties. With <5 := 1 — H&H^i > 0, we dehne 


and take J large enough that 


We then dehne c and d by 



yK<l 

j - 3 

j>J 


(3.128) 

(3.129) 


Cj = (1 + tfjbj and dj 


1 + p’ 


3 < A 


(3.130) 
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and 


(3.131) 


Cj = Ifj and dj = fe] j > J. 

By construction, we have cjdj = bj for all j > 1- For the sequence c, we have 

||c||£i < (1 + h)ll^ll^i + ^2 ^ 3 ) ~ 3 - ^ ~ 3 ’ (3.132) 

j>J 

We next bound ||(i||£o°. For 1 < j < J, we have dj = < 1 and for j > J, we have 

= (3,133) 

Therefore, we have ||d||^°o < 1. Finally, since d^j = for j > J, we hnd that d G £'^(N), 
which completes the confirmation of the claim. □ 


Similar to Lemma 13.191 the following result shows that summability is maintained if 
we incorporate additional algebraic factors. 


Lemma 3.21 For a given sequence b = {bj)j>i of positive numbers, and for non-negative 
numbers c and r, let {by)y(zj: he defined by 

b,,:=^-^b‘'l[il + cui;). (3.134) 

t>i 

For any 0 < p < 1, this sequence belongs to £P{iF) if and only ifb^ f'P(M) and H&H^i < 1. 

Proof: Since fej, > -^6^, the “only if” part follows from Lemma 13.201 and we only need to 
prove the if part. 

Using the same sequences c and d as in the proof of Lemma [3.201 and introducing 

d,. = d'^Y[{l + cn^.), (3.135) 

t>i 


we write 




< 




i-p 






and conclude in a similar manner that both factors are hnite, using Lemma 13.191 for the 
second factor. □ 
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3.7 Proof of Theorem 13.9 

In order to prove Theorem 13.91 we use the estimates (13.471) . (13.881) and (13.871) for the ||ti/||v, 
||n,y|l\/ and respectively. The right-side of these estimates has a general form C'r(z/, p) 

for any sequence p of numbers larger than 1 that satisfy the constraint fl3.48l) . Our objective 
is to build for each v such a sequence p = p{v)-, and show that, for 0 < p < 1 the resulting 
quantities 

Tj, := r(z/,p(z/)), (3.136) 

are summable provided that (||V'illv)j>i G Obviously, it is sufficient to treat the 

case when 

r(v,p):= J] e(pj)(l + 2u,)p--\ (3.137) 

iGsupp(!^) 

which appears in the right of 03.871) . since it is the largest estimate. 

We £x an arbitrary v ^ T and describe our choice for the sequence p that we insert into 
the above expression. In what follows, we use the notation 

where 6^ := ||?/;j||x, j > 1- (3.138) 

For J > 1 to be fixed further, we split N into 

E := {1,..., J} and F := {J + 1, J + 2,...}, (3.139) 

and use the notation ve = iyi, • • •, and vp = (^j+i, ^j+ 2 ,...) G F. In view of 

Remark 13.161 we may take 

pj = l, j i supp(z/). (3.140) 

With £ the right side of the constraint 03.481) . we then take 


pj = K -.= I + 


2 || 6 ||„ 


, j e Ensupp(iy), 


and 


Pj = K + 




2bj I Up 


■, j e F nsupp(z/). 


Therefore pj > 1 when j G supp(z/), and in addition 




+ 


E( 


ebi 


+ 




j>J 


V2||6|l,i 2\up 


< e. 


(3.141) 

(3.142) 


(3.143) 


which shows that the constraint 03.481) is satished. 

When using this choice for the sequence p, the resulting estimate may be written 


Ti, = r(u,p(u)) = rE{i')rF{i'), 


(3.144) 
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where 


]^(1 + and rp{v) ■= 6{pj){l + 2Pj)p-''\ (3.145) 

j&E jeFnsupp(i/) 

Denoting by Tp and Tp the multi-indices in supported on E and F, respectively, we may 
then write 

J2rl = ^E^P, (3.146) 

u£E 

where 

T,e := ^ rpi^Y and T,p := ^ rp{uy, (3.147) 

v^Te v^Tf 

provided that both sums converge. 

The first sum is estimated by 

./ 

Ee = ^ n(l + 

i/gn-^ i=i 

= + 2n)Pfi:-P")'^ < cx), 

n>0 

For the second sum we first notice that for each v G Ff, 


rF{y)< n + 

jGFnsupp(i/) 


EVa 


2bj\i'p 




(3.148) 


where we have used the fact that 6{k) = ma.xt>K0{t) > 0{pj) for j G F. Therefore, with 
c := 39{k), we hnd that 


rp{u) <|z/F|l"^in 


(1 + cz/,) (|i) 


U- 

j^F 3 




Up'. 


2eb, 


jeF 


where we have used fl3.123p . Introducing the sequence d = (dj)j>i defined by 

2e6j+j 


dj = 


we thus find that 


Sf < where F := —| + c^i)- 

u&E j>l 


(3.149) 

(3.150) 
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We now choose J sufficiently large so that 

II 111 X—^ 2cbj . . 

\\dy = J2 <^- (3.151) 

j>J ^ 

Since our assumption b G implies that d G £^(N), we may apply Lemma 13.211 to 

conclude that SiT’ is hnite. The proof of Theorem 13.91 is complete. 

Remark 3.22 One defect in the proof the Theorem \3.tA is that, while it establishes the (T 
summability of the sequences and (||wy||y)ygj-, it does not provide 

us with a simple hound of the P’ norms of these sequences in terms of the P norm of the 
sequence {\\'il>j\\x)j>i- 

3.8 Approximation using downward closed sets 

Theorem l3.9l has implications on the rate convergence of polynomial approximations obtained 
by retaining the terms in Taylor and Legendre series corresponding the n largest coefficients. 
Corollaries 13.101 and 13 .11 1 show that these approximations converge with the rates n~^, where 
s = i — 1 for uniform convergence and ^ ^ ~ ^ for convergence in LP'iU, V, ju). 

These results should be viewed as a theoretical justihcation that reduced modeling meth¬ 
ods based on polynomial approximations may perform well for parametric PDEs which 
satisfy the assumptions of Theorem 13.91 However, they constitute, by no means, a numerical 
algorithm since Ending the optimal sets (A„)„>i are, in practice, out of reach, and so is the 
exact computation of the Taylor and Legendre coefficients. 

Practical algorithms for the computation of polynomial approximations are discussed 
later in this paper in §5 and §6. The implementation and analysis of the algorithms presented 
there benefit from imposing additional structure on the index sets used to define the 
polynomial approximation. To define this structure, we first recall that iF has a partial 
ordering: for z/, G we write < z/ if and only if hj < Vj for all j > 1. We also write 
z> < z/ if and only if < z/ and hj < Vj for at least one value of j. 

Definition 3.23 A set A G F is called downward closed or a lower set if and only if 

z/ G A and D < v implies G A. (3.152) 

When considering polynomial spaces 

Pa := span{|/ ^ y'^ : z/ G A}, (3.153) 

it is quite natural to make the assumption that A is a downward closed set. In particular, 
this assumption allows us to describe Pa in terms of any tensorized polynomial basis of the 
form 

= W(t>yi{.yj), (3.154) 

i>i 
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where {4)k)k>o is any family of univariate polynomials such that 0o = 1 and (pk has degree 
exactly k. This includes in particular the tensorized Legendre polynomials Ly. By expressing 
each monomial y ^ as a. linear combination of the (pi for 0 < / < fc, we find that Pa is 
equivalently defined by 

Pa := span{(/)y : u G A}, (3.155) 

Polynomial spaces associated to downward closed sets have been introduced in [59], in di¬ 
mension d = 2 and refered to as polynomes pleins. Later, these notions were studied in 
general dimension d, in [50] and [53] • Note that in dimension d = 1, a downward closed set 
is simply of the form {0,1,..., n}. 

The sets index sets A„ corresponding to the n largest ||ti/||v, Hwi/Hv or ||tCi/||v are gener¬ 
ally not downward closed sets. A legitimate question is therefore: does there exists nested 
sequences (A„)„>o of downward closed sets such that the truncated Taylor or Legendre series 
using such sets have the same convergence rates as those obtained in Corollaries 13.101 and 
13.111 using the n largest ||ti/||v, or ||tCj,||y? The results of the present section give a 

positive result to this question. 

Let us begin by observing that if a sequence {cy)y^jr of positive numbers is monotone 
non-increasing, that is, if 

U < 9 ^ Cy < Cy, (3.156) 

then the set A„ corresponding to the n largest values of Cy is downward closed, provided that 
it is unique. In case of non-uniqueness, there is at least one realization of such a set which 
is downward closed. In addition, there exists a sequence (A„)„>i of such realizations which 
is nested. Note that in such a realization, we necessarily have Aq = {0}. 

For an arbitrary sequence c = {cy)y^jr G we introduce its monotone majorant 

which is the sequence c = {cy)y^jr dehned by 

Cy:=sup|Cy|. (3.157) 

U>U 

This is the smallest monotone non-increasing sequence that dominates c. In order to study 
best n-term approximations using downward closed sets, we introduce the following sequence 
spaces. 

Definition 3.24 For 0 < p < oc, we say that a sequence c G belongs to if and 

only its monotone majorant c belongs to £P{jF) and we define 

\\^\\(-p- (3.158) 

Combining this definition with Lemma [3.61 shows that if0<p<g<cxo and if {cy)y^jr is 
a positive sequence which belongs to then one has the tail bound 

C=\\{cy)y^rhi,, S (3.159) 
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where A„ is any downward closed set of indices corresponding to the n largest terms of the 
monotone majorant c of c. We may therefore obtain the same rate as in Lemma ESI now 
using downward closed sets. 

We would therefore like to know under which circumstances the sequences (||ti.||y)i/GJ ='5 
(||'yj/||y)j/eJ' and (||tCi/||y)j- belong to The following result, originally proved in [1^ 

in the case of elliptic parametric PDEs and in [19] for other models, shows that this holds 
under the exact same assumptions as in Theorem 13.91 


Theorem 3.25 Consider a parametric problem of the form (II.ip such that Assumption A 
holds for a suitable affine representer Then, the following summability results hold: 

• If the assumptions of Theorem \2.8\ are satisfied, and if in addition (||'0ilU)j>i £ 

for some p < I, then (||ti/|lv)i/GjF G same value of p. 

• If the assumptions of Theorem\2.9[ are satisfied, and if in addition (W'lfj IU)j>l 6 <'(N) 

for somep < 1, then (||ni/||y)jygj- G (||'?ni/|l^ same value 

ofp- 


Proof: Similar to the proof of Theorem 13.91 we use the estimates fl3.47p . fl3.88p and fl3.87p 
for the ||ti/||v, ||n!/||v and ||tc,^||y. 

In the case of ||ti/||y, the estimate has the form 

\\tu\\u£F < ev'■= C ml, (3.160) 

where the infimum is taken over the set of sequences p of numbers larger than 1 that satisfy 
the constraint fl3.48p . Since for any such p, the sequence {p~'')y^jr is monotone non-increasing, 
it follows that the sequence {ey),j^jr is also monotone non-increasing. On the other hand, the 
proof of Theorem 13.91 shows that G This implies that (||tjy||y)jygj- G 

We cannot proceed in the same way for the Legendre coefficients |lni/||y and HwiyUvr since 
the right side Cr{u,p) in the estimates fl3.88p and fl3.87p do not have the monotone non¬ 
increasing property due to the presence of the factors 0{pj) and (1 -|-2^'j). Instead we slightly 
modify the construction of the sequence p = p{u) in the proof of Theorem 13.91 and show 
that the resulting sequence of estimates 


ru = r{u,p{n)), 


(3.161) 


has a monotone majorant which is summable over Here again, it suffices to work with 
the estimate fl3.87p which is the largest one. 

We use the same notation as in Theorem 3.9, in particular bj := For a constant 

/3 > 0 to be hxed later, we take J > 1 large enough such that 




(3.162) 


where e is the right side of the constraint fl3.48p . 


62 




























We now let z/ G -F and fix v and proceed to define an appropriate sequence p = p{u) for 
this V. Namely, using the same splitting of N into E and F, we take 


where b = {bj)j>i and 


Pj = K := 1 + 


Pj — K -\- (3 -\- 


3||6||,. 


j 6 £:nsupp((/), 


£Z/,- 




■, j e Fnsupp(i/). 


(3.163) 


(3.164) 


We again take pj = 1 if j ^ supp(z/). Therefore pj > 1 when j G supp(z^), and in addition 




j>i 




j>J 


+ [3bj + 7^^ ) < £. (3.165) 


3II&II 




si^fI 


which shows that the constraint fl3.48l) is satished. 

For this choice of p, the estimate fl3.87p may be written 

\\wi,\\v <ru = r(i/,p(i/)) = rE(i/)rip(i/), 


(3.166) 


with VEiv) as in the proof of Theorem 13.91 and a slightly modified rpiv) that incorporates 
the new form of pj for j G F. This new rplu) satisfies 


rip(z/) < rip(z^) := 9{k){ 1 + 2uj)(^/3 + 

jeFnsupp{u) 




?)bj\i>F\ 


< 9{E){l + 2vj 

jGFnsupp(y) 


evi 


?)bj\i>F\ 


Since k > 1, we there exists Co = Co(k) > 0 such that (1 + 2n) < Co(-^)" for any n > 1 
and so we can write 




p : = 


jeE 


1 + K 

2k 


< 1 and C = {Co9{K)y. 


(3.167) 


The same argument as in the proof of Theorem 13.91 shows that, up to choosing a larger J, 
the estimates 

fu ■= hE(z/)rip(z/), (3.168) 

are summable over E. 

We conclude by showing that is monotone non-increasing if B has been chosen 

large enough. On the one hand, since p < 1, it is readily seen that 


V < V ^ TEiv) < rE(z/). 


(3.169) 


For proving a similar monotonicity property for the second factor it suffices to show that 
fpiF) is reduced if we increase Vj by 1 for any j > J, that is 


fpiy + ej) < rip(z/). 


(3.170) 
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where Cj = (0,..., 0,1, 0,...) is the Kroenecker sequence with 1 at position j > J. In the 
case where fj ^ 0, we may write 


rpii' + ej) _l + 2z/j+2 + 


Zbj\up\ 


rpiv) 


< 


1 + 2vj 
2 


n 


O I e(uj+l) J-l V/3 + 

P + 3bji\uF\+l) ) feGFnsupp(i/)-{i} 


/3+ ^ 


SbkWp] 




36fe(|i^F|+l) 


n 


n I e{uj + l) 

p ' 3bj(|i/j^| + l) fcGFnsupp(F) 


^+ 3 ®;:} yk ^ 2/1 + \uf\\Pf\ 






36fc(|FF|+l) 


/3V \pf\ 


and therefore 


ri7(i/ + ej) ^ 2e 


In the case where Vj = 0, we have 
fFiy + ej) 3c«; 




fp^p) 


n 


o I E(^j + 1) 

P ' 36j(|ff|+1) A:eFnsupp(F)-{j} 


/S 


f3 + 


£Vk 

36fc|FF| 


/? + 


£Vk 


3bfc(|FF|+l) 


3c«.e 

< 




(3.171) 


(3.172) 


We thus find that {ry)^^jr is monotone non-increasing provided that (3 > max{2e, Sc^e}. □ 


Combining the above Theorem with (I3.159p . we obtain the following result. 

Corollary 3.26 Corollaries \3.1(A and \3.11\ remain valid, with the sets of corresponding 
to n largest terms in the seguences (||dlv)!yG.Fj (IIwIIv)fgJ' or {\\wy\\y)y^F, replaced by down¬ 
ward closed sets corresponding to the n largest terms in the monotone majorants of each 
of these seguences. 


3.9 Exponential approximation rates 

The rates of convergence that are established for polynomial approximations in Corollar¬ 
ies [SHO] and [TTT] are of algebraic type. We conclude this study of polynomial approximation 
by a brief discusssion on the circumstances where faster rates of exponential type can be 
established. For this, we focus on the finite dimensional case, that is, when finitely many 
fjj are non-zero in the affine representation fll.lSp . In such a case, one first obvious observa¬ 
tion is that since (||'0j||x)i>i £ -^^(N) for all values of p > 0, Corollaries 13.101 and 13.111 give 
convergence rates for all s > 0. However, a more detailed inspection shows that the 
multiplicative constant Cg obtained in front of this rate grows very fast to -|-oo as s —?> -|-oo. 
Instead of trying to search for a fast rate by optimizing Csn~^ over s for a given n, we 
return to the estimates on the polynomial coefficients and use them to obtain exponential 
convergence rates for the truncated series (I3.20p . (I3.22p or (I3.26p . 

Without loss of generality, we assume that only {ipi ,..., fjd} are non-zero, meaning that 
the scalar parameter vector is now 

y={yy...,yd)&U ■.= [-!,lY, (3.173) 
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and that the solution map y ^ u{y) from 17 to Id is hnite dimensional. Polynomial approx¬ 
imations are again based on truncation of the series fl3.2Up . fl3.22p or 03.261) . now with 

7 = (3.174) 

For the sake of simplicity, we focus our attention on Taylor series and make some further 
remarks on the case of Legendre series. 

A particularly simple case for estimates of Taylor coefficients is that of the disjoint inclu¬ 
sion model for the elliptic and parabolic PDFs 01.5p and 02.401) discussed in ^3.41 In this case, 
working under UEA(r), we explicitly solved 03.751) for any given 0 < f < r and obtained the 
estimate 

d 

\\tAv<Cp-’^ = Cl[pp (3.175) 

i=i 

with C = Ct and 

Pj = P*j = inf ^1^ >1, j = 1,..., d. (3.176) 

x£D \yjj{x)\ 

More generally, if we work under the assumptions of Theorem 12.81 we know from Lemma 
13.141 that we have an estimate of the form 03.1751) for any choice of pj > 1 such that 

d 

Y.^Pl-mi>l\\x<e. (3.177) 

i=i 

We may, for instance, take 

p, :=1 + — ^>1, j = l,...,J. (3.178) 

We thus again reach the estimate 03.175P with a fixed finite vector (pi,..., p^) independent 
of V and whose oordinates are strictly larger than 1. 

Based on such an estimate, a natural choice for the sets A„ is to pick the indices v 
corresponding to the n largest values of p“^. Equivalently, for any given threshold p > 0 we 
dehne 

A„ := {i/ G -F : p~'' > p}, where n = n(p) := G PF : p“^ > p}. (3.179) 

Notice that as we vary p > 0, it may be that not all values of n arise because of possible ties 
in the values of p~^. 

Let us now focus on the particular thresholds 

P = 2■^ A: > 0, (3.180) 

we may write, with n := n{k) growing with k, 

d 

An = Ffc := {z/ G J-” ; Aj< k}, Xj := log2(pj) > 0. (3.181) 

i=i 
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Sets of this type consist of all integer lattice points inside the simplex with bounding hyper¬ 
planes given by the coordinate hyperplanes to gether with the hyperplane ~ 

Note that these sets are downward closed. 

The cardinality of the above A„ is bounded from above by the volume of the continuous 
simplex 


d 


{(^1,- 

.., td) e 

IV 

1 

.. ,d, and 

• 

i=i 



IV 

1 

.. ,d, and 

d 

: Xjtj < k} 

i=i 


This gives the crude cardinality bound 


#(A„) = #{Sk) < \Tk 


1 



k + Eti 

A, 



< Ck'^ 


(3.182) 


where C depends on d and on (Ai,..., A^). 

Likewise, we may estimate the approximation error when retaining the n terms whose 
indices are in A„ by 


sup 

y&U 


u{y) - tuv’" 


< X] \\tu\\v 

u^Sk 

< : 2"^"^ < p-'' < 2"^} 
l>k 

l>k 


Using the estimate fl3.182p on the asymptotic growth of #(-5'^), we this hnd that 


sup 

y&U 


^(v)-Y.Kv‘' <Cj]2-'(i + l) 

h'ES'fe l'>k 

Combining this estimate with fl3.182p . we obtain 

“( 2 /) - tyV"' < C'exp(-c/c), 

uGSk 


d 


sup 

y&U 


which is equivalent to the exponential rate 


sup 

yeu 


u 


{y) - X] - Cexp(-cn 




(3.183) 


(3.184) 


(3.185) 


with multiplicative constants c and C that depend on d and on (Ai,..., A^). Since this rate 
is valid for all n of the form 4^{Sk) which grow like it is easily seen that it is also valid 
for all values of n > 1, up to a change in the multiplicative constants. 
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Remark 3.27 We notice that this exponential rate deteriorates as d grows, due to the power 
1/d, as well as to the hidden dependence on d in the constants c and C. However in the case 
where the H norm of {\\'iljj\\x)j=i,...,d remains uniformly hounded for some 0 < p < 1 as we 
raise d, our analysis of the infinite dimensional case always ensures the algebraic rate n~^ 
with s : = i — 1. 

p 

Remark 3.28 A similar analysis leads to the same exponential rates for approximation by 
truncated Legendre series, now under the assumptions of Theorem \2.9l based on the estimates 
fl3.87p and fl3.88p . up to a proper treatment of the algebraic factors 0{h'j) and (1 + 2z/j) 
appearing in these estimates. 

4 Estimating the n-widths of solution manifolds 

We have already noted that, when approximating the solntion map by separable expansions 
of the form fll.32p or fll.33p . the best achievable error in L°°{A,V) or in L°°{U_a,V) is 
described by the n-width of the solution manifold Ai = u{A) in V, that is, 

dn{Ai)v ■= inf sup min ||n — tally (4.1) 

dim(Vy)=n v€M 

In this section, we use the polynomial approximation results established in the previous 
section to derive a priori estimates for the decay of dn{Ai)v- 

4.1 Estimates of n-width by polynomial approximation 

In the case where Assumption A holds, we may use the polynomial approximation results 
of §3 to estimate dn{Ai)v from above. Indeed, if Un{y) = i® ^ polynomial 

approximation to the map y ^ u{y) for some set A„ C of cardinality n, we dehne the n 
dimensional space 

14 := span{c,, : v G A„} C V, (4.2) 

and observe that 

dn{Ai)v < sup min ||n — w\\v = sup min \\u{y) — w\\v < \\u — Un\\L°°{u,v)- (4.3) 

V&M v&Ua 

Therefore a polynomial approximation bound in L°°{U,V) induces an estimate on the n- 
width of Ai in V. Combining this observation with Corollary 13.111 we obtain the following 
result. 

Corollary 4.1 Consider a parametric problem of the form (ll.l|i such that Assumption A 
holds for a suitable affine representer (I1.15|) . Assume that the solution map u u{a) admits 
a holomorphic extension over an open set O <Z X which contains the compact set a{U) and 
this extension satisfies the uniform bound 

sup ||M(a)||y < C. (4.4) 

aeo 
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//(IIV’illx)j>i £ ^^(N) for some 0 < p < 1. then 

dn{M)v < C{n + ly, n>l, s := -1, (4.5) 

P 

for a suitable constant C. 

Proof: We consider the truncated Legendre expansion 

I'GAn 

where is the set of indices corresponding to the n largest ||tCi/||y. Since the assumptions of 
Corollary 13.111 are satished, we obtain (I4.5p with C := ||(||tCj,||\/)i,gj-||£p, by combining (14. 3 h 
and fl3.42p . □ 

One drawback of the above result is that it requires Assumption A. For some natural 
examples of compact sets A of A, this assumption may not hold. For instance, in the case 
of the elliptic equation fll.Sp . we know that the standard compact sets of A = L°°{D) are 
described by a smoothness assumption. A typical example for A of this type is 

A := {a G A : a > r, \\a,\\c 0 < M}, (4.7) 

for some M,[3,r > 0, where := C^{D) is the Holder space with smoothness (3 > 0, 

equiped with its usual norm 

||a||c/3 := sup ||(9"a||L°° + sup sup |a; — a;'|“‘'^“”^^|(?"a(x) — (9"a(x')|, m:=[/3\. (4.8) 

|a|<m \a\=mx,x'GD 

For such A, there are many ways to choose an a G A and a properly normalized basis 
such that expanding a — a in this basis allows us to write 

A C a{U), (4.9) 

with a{U) of the form fll.281) . However, it will generally not follow that there is an r' > 0 

such that for each a in a{U), we have a > r'. Therefore, we are not guaranteed to have well 

posedness of the PDF for all u{y) E U and so Asssumption A will not hold for this affine 
representation. We fix this defect in the next section by a different approach based on local 
polynomial approximations. 

4.2 Estimates of n-width by local polynomial approximation 

In this section, we treat parameter sets A G A which may not have Assumption A . We 
assume that {'ipj)j>i is a complete representer for A and in addition that (||'0il|x) G -^^(N). 
It follows that for each G U, the series J2j>i converges in A and so the set 

77 ^ Zj'ipj : z = {zj)j>i G w|. 

i>i 
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(4.10) 










is well defined. We replace Assumption A by the requirement 


Ac7^. (4.11) 

Notice that, in contrast to A, the set TZ might not be contained in the open set O over which 
the solution map admits a bounded holomorphic extension. However, we will remedy this 
problem by using the following covering result. 

Lemma 4.2 Let A he a compact set in a complex Banach space X, and assume that A G TZ 
wherelZ is of the form fld.lOp for a family of functions {'ipj)j>i such that (||'0j||v:)j>i £ 

Let O he any open set of X which contains A. Then, there exists ri,e > 0, an integer J > I 
and a finite collection {ai,..., om} C X such that defining 

i = j > J, (4.12) 

and for any seguence z = {zj)j>i G 

ai(z) ;= Oi + ^ Zj'tpj, i = 1,.. .,M, (4.13) 

i>i 

whenever the series on the right converges, the following holds: 

(i) The compact set A admits the following cover 

M 

AcljAi, Ai ■.= ai{U) = {ai{z) : zeU}. (4.14) 

i=\ 

(ii) The compact sets Ai, ■? = !,..., M, are all contained in O. 

(hi) For any seguence p = {pj)j>i of numbers, each larger than 1, which satisfies the con¬ 
straint ~ l)lll/^jllx: < £, there exists, for each j > 1, an open set Op. C C which 

contains the disc {\zj\ < pj} and for which the set Op := ®j>iOp. satisfies 

ai{Op) := {ai{z) : z G Up} C O. (4.15) 

Proof: Similar to the proof Theorem 12.81 we first observe that since A is compact, there is 
an £ > 0 sufficiently small, such that the 3e neighborhood of A is contained in O, that is, 

[j B{a,3e) cO. (4.16) 

For this e, we next choose J > 1 large enough so that 

(4.17) 

j>J 


We then define 








This fixes the £, r] and J claimed in the theorem. In going fnrther, we nse the notation 


Uj:={zeU-. Zj = 0, j > J}. (4.19) 

Since A C 71, for any a E A there exists a. z eU snch that 

,7 

a = ^ Zj^pj + ^ Zj'ijjj =: aj + {a - aj). (4.20) 

1=1 j>J 

Note that this decomposition may not be nniqne - since the xjjj are not assnmed to be linearly 
independent - bnt, for each a G we assign one snch decomposition. We can find a finite 
set F C Uj, such that, for each z eUj, there \s a z' E F such that 


\\z - ;2'||£°°(n) < V- (4-21) 

We let {oi,..., om} be the finite set consisting of all elements in X of the form 

.7 

ai = '^z'j'ipj, (4.22) 

1=1 


where z' E F and in addition there is an a = ^iV'i ^ *4,, such that 

\zj-z'j\<r], j = (4.23) 

Let us now show (i). If a G ^ and a = then according to (I4.21j) and (I4.23p . 

there is a Hi such that 

,7 

aj - 0,1 = \cj\ < r], (4.24) 

1=1 

which implies that a E Ai. 

Next, note that (iii) implies (ii). Indeed, take any p satisfying the assumptions of (iii), 
then hi <Z Op and hence the validity of (iii) implies Ai C OiiOp) C O for each 1 = 1,..., M. 

We are left to prove (iii). For this, let p be any sequence satisfying the constraint in (iii) 
and define for each j > 1, the sets 


^Pj {l^ll ^ Pi); 


Pj ■- Pj + 


E,>i W'l’iWx 


(4.25) 


We need to check that OiiOp) <Z O, i = 1.2...., M. For this, we fix any value of i. We know 
that Oi = X]j=i ^iV'i) cind from fl4.23p . there is an a* = ^ which \zj — z*\ < p 

for j = 1,..., J. In view of fl4.17l) and the definition of p, we have 



(4.26) 
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Now take any a G a*(Op), that is 


a = Oj + ^ Zjijjj, 
i>i 

with = {zj)j>i G Up. We define 

Zj = Zjmm{l,\zj\~^}, j > 1, 

so that {zj)j>i is a point in U. We can now estimate 


\a — a*\\x < 
< 


a ~ O'iWx + \\o.i — 0, ||x 
e 

+ 7t 
X 2 


i>i 
.7 


i=i 

,7 


j>J 


X 




i>i 


£ 

+ X 
X 2 


< iiV'iiix + iiv’jIU + II 

j>J 

- ^j)t 

i>i 




< 


i=i 
e e 

4 + 4 


i>i 

e 

x + 2- 


£ 

x+2 


Since 
we obtain 


\Zj -Zj\ < {pj - 1) < Pj - pj + Pj - 1, j > 1, 


(4.27) 


(4.28) 


||a - a*\\x <e + '^{pj - Pj)||^il|x + '^{Pj - l)ll^illx <e + e + e = 3e, 
i>i i>i 

where we have used (14.251) to bound the first sum and the assumption on {pj)j>i in esti¬ 
mating the second sum. This shows that a belongs to the Se-neighborhood of A which is 
contained in O. Therefore ai(Up) C O. □ 


With the help of the above lemma, we now establish a result which shows that the 
conclusion of Corollary 14.11 remains valid without the assumption that A is of the exact form 
a{U). 

Theorem 4.3 For a parametric problem of the form fll.ll) . assume that the solution map 
u I—)■ u{a) admits a holomorphic extension over an open set O of the complex Banach space 
X which contains A, with uniform bound 

sup ||M(a)||y < C. (4.29) 

a&O 

Assume in addition that there exists functions in X such that (||'0j lUtei e <'(N) 

for some 0 < p < 1, and such that A <ZTZ, where IZ is of the form fl4.10p . Then, there exists 
C* > 0 such that one has 

dn{Xi)v < Cn~^, n> 1, s := -1. (4.30) 

p 
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Proof: Applying Lemma [4.21 we write 


M 

^ c IJ A, 

i=l 


(4.31) 


and therefore 


M 

Mc\Jm„ Mr-=uiA,). 

i=l 


(4.32) 


It now sufficient to prove that the estimate (I4.30p holds for each Aii in place of Ai, that is 


dn{A4i)v < Cn n > 1, 



1 . 


(4.33) 


Indeed, if for each i = 1,..., M one can approximate all elements of Aii with accuracy 6 
by elements from an n dimensional space Vn^i-, then one can approximate all elements of Ai 
with the same accuracy by elements from the space 14,i © • • • © 14 ,m which has at most 
dimension nM. This shows that 


dMn{Ai)v < max dniAii)v, (4.34) 

and therefore (14.331) implies (I4.30p up to a change in the constant C. 

The proof of fl4.33l) follows from Corollary 13.101 We fix i E M}, we know that 

Ai C a,(U), (4.35) 

where ai(z) := a* + X] 7 >i If follows that the assumptions of this corollary are satisfied 
for Ai and Oj in place of A and a. This confirms the estimate fl4.33p and therefore concludes 

the proof. □ 

Remark 4.4 The above theorem can be formulated for a general map u from A to V that is 
not necessarily the solution map of parametric PDE, following the arguments from Remarks 
\2Mand\3AM 


4.3 n-widths under holomorphic maps: a general result 

In this section, we let u be any map from A to V, not necessarily the solution map to a 
parametric PDE. In view of Remark 14.41 Theorem 14.31 gives an estimate for the n-widths 
oi Ai = u{A) whenever u has a bounded holomorphic extension to a neighborhood of A, 
provided that A is contained in a set 71 of the form fl4.1Up with (||4illx:)i>i ^ with 
0 < p < 1. Notice that the containment and P’ summability assumptions on A imply a 
decay on the n-width of A in X. Namely, for s ^ — 1, we can write 


dn{,A)x < sup min 
aeA 


a 


n 


j=i 


<'^\\f^j\\x <c{n + l) *,n>0, 

j>n 


(4.36) 


72 


























where we have used Lemma [3.61 Thus, in a certain sense, the results of the previous section 
can be interpreted as saying the Kolmogorov widths of M inherit the decay rate of the 
widths of A. Since, the sets A are generally more accessible and their n-widths are more 
readily computed, it is natural to ask whether there is a general principle in effect here. 
That is, do the n-widths {dn{Ai)v)n>i of an image Ai = u{A) of a compact set A under a 
general holomorphic map u have the same decay as that of {dn{A)x)n>i- The main goal of 
this section is to show that there is indeed such a general principle in effect, however with a 
slight loss in the decay rate of the widths of Ai when compared with those of A. 

Our first step in deriving such comparison results is to show that whenever a compact set 
^ of a Banach space X has widths {dn{A)x) with some prescribed decay, then A is contained 
in a set TZ of the form fId.lOp where the X-norms of the rjjj defining TZ are P’ summable for 
certain values of p. For this, we need the following classical result due to Auerbach, the 
proof of which is given below for completeness. 

Lemma 4.5 Let E be an n-dimensional subspace of a complex Banach space X. Then, there 
exists a basis {pi ,..., for E and a dual basis {(pi ,..., (pn} in X' such that 

{ThTj)x',x = dij, i,j = l,...n, (4.37) 

and 

\\Ti\\x = \\Ti\\x'= i = l,...,n. (4.38) 

Proof: We start with an arbitrary basis ipi,... oi E and lei fji,... ,'ipn be its dual basis 
in E', that is 

= 6ij, (4.39) 

where 6 is the Kronecker delta and (•, •) := (•, ■)x',x throughout this proof. Then, any f E E 
can be uniquely written as 

n 

f =f)i^i- (4.40) 

Given any (^fi,..., pn) G 77”, we define 

J{9i, ■■■,9n) = \ det{M{gi,.. .,gn))\, M := ((^^i,5'j))ij=i,....n. (4.41) 

We now take [pi ,..., pn) E E'^ such that 

{ifi, ...,ipn)-= argmax J( 5 (i,.. .,gn), (4.42) 

where the maximum is taken over all (^fi,..., gAj E 77" such that \\gi\\x = 1 , for i = 1,..., n. 
This maximum is attained since the function J is continuous and we are maximizing over 
a compact set. The functions ipi,...,ipn are linearly independent since this maximum is 
positive. Hence, they form a basis for E and any f E E can be written uniquely as 

n 

f = '^{Ti,f)Ti (4.43) 

i=l 
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where i = 1,... ,n, is it dual basis. Applying the functional to both sides of fl4.43p . 
we obtain 

n 

= f = l,2,...,n (4.44) 

i=i 

From Cramer’s rule, it follows that, for any j G {1,..., n} and any f ^ E, 




J(^ipi, . . . , ipj—i, /, . . . , ‘-Pn) ^ 

J{ipi^ . . . , ipn) 


(4.45) 


This proves that for each j, we have ||<^j||E' = 1. By application of the Hahn-Banach theo¬ 
rem, we can extend (pj over all of X with ||(,Pj||x' = 1- n 


Using the Auerbach Lemma, we now show that whenever ^ is a compact set of a complex 
Banach space X and dn{A)x has some prescribed rate of decay, then A is contained in a set 
TZ of the form fl4.10p and the X-norms of the il^j defining TZ have the same rate of decay as 
dn{A)x ■ 

Lemma 4.6 Let X be a eomplex Banach space and .4, C X he a compact space such that 



supn^d„(4.)x < oo- 

n>l 

(4.46) 

Then, there exists a family ('4j)i>i 

of functions from X such that 



supillV’ilU < 

i>i 

(4.47) 

and 

A<ZTZ: = 

i>i 

(4.48) 


Proof: From 04.461) . we know that here exists a constant C > 0 and a sequence of spaces 
(yk)k>o with 14 C X and dim(14) = 2^, such that 

max min ||a — g\\x < k > 0. (4.49) 

By replacing 14 by Vq +14 + • • • + 14-i and possibly changing the constant 
that the spaces 14 are nested: 14_i C 14, for all k>l. 

For any a G 4., we denote by Ok a best approximation to a from 14 
a_i := 0. Then, Qk := Ok — ak-i is in 14, and we have 

a = ^9k- (4.50) 

k>0 

In addition, there exists a constant C > 0, such that 

\\gk\\x < C2-^^ k>0. (4.51) 


C, we may assume 
for /c > 0 and set 
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By Auerbach’s lemma, for every /c > 0, there exists a basis of tho space 14, and 

a dual basis C X' such that ||(^a:,z||x = ||‘^fc,z||x' = 1- It follows that any a & A 

can be written as 

2k 

= ^ (4.52) 

k>0 1=1 

Each integer j > 1 can be written uniquely as j = 2^ + / — 1 with / G {1,..., 2^}. We use 
this to dehne 

4,-:= C'2-^Vm, J = 2^ + l-l. (4.53) 

This gives that any a G ^ is of the form 

a = ^Zj'ipj, \zj\<l, (4.54) 

i>i 

that is, A gTZ. In addition, we have 

||4ilU < Cj~", (4.55) 

up to a change in the constant C. □ 

An immediate consequence of the above lemma is that if dn{A)x has the rate of decay 
n~^, then (||4illv)j>i G ^^(N) for any p such that sp > 1. Combining this observation with 
Theorem 14.31 leads to the following result which shows that the rate of decay of n-width is 
almost preserved under holomorphic maps, up to a loss of 1 in the rate. 

Theorem 4.7 For a pair of complex Banach spaces X and V, assume that u is a general 
holomorphic map from an open set O G X into V with uniform bound 

sup ||M(a)||v 4 C. (4.56) 

a£0 

If A C O is a compact subset of X and M. = u{A), then for any s > 1 and t < s — 1, 

snpn^dn{A)x < oo ^ supn*(i„(A4)\/ < oo. (4-57) 

n>l n>l 

Some comments on this result are in order. If u was a linear map, could write for any 
subspace Xn C X of dimension n and any a E A, 

min ||n(a) — n||y < C iiiin ||a — a||x, C ■.= \\u\\c(x,v)i (4.58) 

veVn aex„ 

with 14 := u{Xn) C V also of dimension n. Therefore, we would obtain 

dn{.Ai)v < Cdn{A)x-, (4.59) 

which implies that dn{Xt)v has at least the same rate of decay as dn{A)x- Theorem 14.71 
shows that holomorphic maps behave almost as good as linear maps, except for the loss of 1 
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in the rate expressed by the inequality t < s — 1. This loss occurs due to a lack of sharpness 
in Lemma [4.61 if we start from the conclusion ||V^j||x < Cj~^ of this lemma, we may only 
retrieve that 

dn{A)x < dn{TZ)x < ^2 (4.60) 

j>n 

An open question is if the implication 04.571) in Theorem 14.71 remains valid with t = s. 

4.4 Towards faster low rank approximations 

We close the hrst part of this article with some remarks concerning our approximation 
results. As explained in the introduction, our general interest is in the accuracy of separable 
approximations of the form 01.32p and 01.33p . These approximations can be thought of as 
the analog of low rank approximations for hnite dimensional matrices. 

Optimal approximations are provided by best optimal n-dimensional spaces W either 
in the sense of n-widths for uniform approximation or Karhunen-Loeve decompositions for 
approximation in the mean square sense. Since these spaces are out of reach, both from a 
theoretical and computational point of view, we build sub-optimal approximations y Un{y) 
based on best n-term truncations of polynomial expansions. This approach leads us to 
quantitative convergence results such as in Corollaries 13.101 and l3. Ill and in turn to estimates 
for the decay of the n-widths dn{M.)v of solution manifolds as discussed in §4, for example 
by using the estimate 

dn{M.)v < \\u — Un\\L°°{uy), (4.61) 

in the case when Assumption A holds. 

A legitimate question is to evaluate the possible lack of optimality of the convergence 
rates, obtained by our polynomial approximation approach, in comparison to the rates which 
could be achieved by using the optimal n-dimensional spaces Vn- Equivalently, we would like 
to know if the rate of decay of the n-width dn{M.)v could sometimes be much faster than 
the rate of decay of the polynomial approximation error on the right of fl4.61l) . 

We can give simple examples which reveal this lack of optimality in the case of the elliptic 
equation fll.5p . Here, we consider the hnite dimensional setting where 

d 

a{y) =ay^yjXpj. (4.62) 

j=i 

In this setting, convergence rates of exponential type 

||m - Un\\L^[u,v) < Cexp(-cn^/'^), (4.63) 

are established in §3.9h ising for Un the Taylor series truncated with the index set correspond¬ 
ing to the n largest values of ||tjy||y. 

Let us here consider the particular case of a piecewise constant diffusion coefficient of the 
form 

d 

(^{y) = + dyj)XDj, (4.64) 

t=i 
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where {Di,... ,-Dj} is a partition of D, that is, a = 1 and ^|Jj = 9Xd - We assume that 
0<6' = 1 — r<lso that UEA(r) holds. 

We first examine the case where D is a one dimensional interval partitioned into sub¬ 
interval Dj. The problem now reads 


- {auj = /, (4.65) 

with homogeneous Dirichlet boundary conditions at the endpoints of D. Since a{y) is con¬ 
stant on each interval Dj, we hnd that the restriction of u{y) to this interval is always the 
sum of an affine function and of a scalar multiple of F such that F” = f . It follows that, for 
any y G [—1,1]'^, the solution u{y) belongs to the finite dimensional space 

l/gd = span{XDi, xXoi, FXoi '■ * = 1, • • • d}, (4.66) 

where x stands for the identity function x x. Using the fact that u{y) is 0 at the endpoints 
of D and continuous at the breakpoint between the Dj, we find that it belongs to an even 
smaller subspace of Vsd that has smaller dimension 2d — 1. This implies that 


dn{,M.)v — 0, 


(4.67) 


for n > 2d — 1, therefore showing that the rate in the right-hand side of fl4.63l) is not sharp 
for dn{M)v- 

Let us now examine a less trivial case where D is a. domain in higher dimension m> 2. 
In such case, it is not true that M belongs to a finite dimensional space, however we can still 
show that the rate in the right-hand side of (I4.63p is not sharp for dn{Ai)v- For simplicity, 
consider the case of a two domains partition, that is, d = 2. Since ||'0i||x = ||'02||x = d, the 
sets A„ that are used in ^3.91 to obtain the rate 


\u - Un\\L^(u,v) < Cexp{-crd^^), 


(4.68) 


have the simple structure 


An = {\l^\ = l>l + l>2< k}. 


(4.69) 


for integers /c > 0. Therefore, we use polynomial approximations of total degree k of the 
form 

Un{y) = Uy’^, (4.70) 

\D\<k 


which have accuracy 

\\u - Un\\L°-(uy) < Cexp{-ck), (4.71) 

with n = ~ fcT 

This trunctated power series can be interpreted in a different way by writing the elliptic 
equation in operator form 


B{yHy) = f, B{y) =B + y,B, + y^B^, (4.72) 
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where 


Bu := —Am and BjU := —div(0XDj Vm). (4.73) 

With B ^ the inverse of —A on D with honiogeneons Dirichlet bonndary condition, we may 
then rewrite the eqnation as 

(J + y,B, + y^B^My) = g, g := B~"f, Bj := B'^B,, j = 1,2. (4.74) 

It is easily seen that the Taylor series of u{y) coincides with the Nenmann series 


«(!/) = + vAYg- ( 4 . 75 ) 

l>0 

The convergence of this series can be directly checked by observing that 

Ibi^i + 2/2^2|U(v,y) < 6^, {yi,y2) £ [“1)1]^- (4.76) 

In particular this confirms the exponential rate 

||w — Un\\L°°{u,v) A C9^ = Cexp(—cfc). (4.77) 

We now observe that, due to the fact that Xdi + Xd^ = Xd, we have the identity 

Bi + B2 = 91. (4.78) 

We may therefore rewrite each term in the Neumann series as 

{-l)\y,Bi + y2B2)^g = {-i)\y29l + {yi-y2)B,)^9 

= (-1)' ^(i/20)'-^' {yi - y2yB{g. (4.79) 

j=o 


Therefore, summing the terms in (14.791) from / = 0 up to fc, we may rewrite Un{y) as 

k 


with 

and 


Un{y) = '^Vj(l)j{y), 

(4.80) 

j=0 


Vj :=&^g e V, 

(4.81) 

(/>j(y) ■= (yi - y^y 5^(-i)'(i/20)'”^ [0 • 

i=i kJ/ 

(4.82) 


This new representation of Un{y) shows that it belongs to the k + 1 dimensional space 


14 = span{Mo, • • • Wfc}- 


(4.83) 


78 




We may thus conclude that 


dk+i{M)v < Ce’^ = C'exp(-cA;), (4.84) 

Since k ~ ^/n, this shows that the rate in the right-hand side of fl4.68p is not sharp for 
dni-M)v- 

These examples reveal that in certain relevant cases, polynomial approximations based 
on best n-term truncations may be highly sub-optimal in comparison to the n-width spaces. 
Note, however, that the rank reduction is made possible due to hne properties of the affine 
representation fll.151) . such as the fact that the ipj are characteristic functions with disjoint 
supports. For other affine representations with general functions ipj which have overlapping 
support, numerical computations show that polynomial approximation rates are sometimes 
close to the optimal rates to be expected from arbitrary separable approximations. The 
development of alternate strategies for a sharper convergence analysis of separable approxi¬ 
mations is thus desirable, and it inevitably requires exploiting the detailed structure of the 
affine representation. 


Part II. Algorithms for parametric PDEs 
5 Towards concrete algorithms 

The results exposed in the hrst part of this paper show that relevant instances of parametric 
PDEs admit separable approximations Un of the form fll.321) or fll.33p with error bounds that 
reflect a certain rate of convergence in terms of the number n of terms that are retained. 

However, these approximations are obtained by mathematical techniques which, as such, 
cannot be implemented through a computational algorithm. For example, in order to com¬ 
pute the best n-term truncation of the Legendre series we need in principle to be able to 
compute exactly all Legendre coefficients Wy and to search for the n largest values of ||tCy||y. 
This is unfeasible for two reasons: (i) we can only compute the Wi, with limited precision 
due to spatial discretization, for example through a hnite element space 14 of V, and (ii) we 
cannot perform an exhaustive search through the inhnite set of multi-indices. 

In this second part of the paper, we discuss concrete numerical methods which compute 
separable approximations Un, still of the form fll.321) or fll.33p . however at an affordable 
computational cost. 

5.1 Space discretization and computational cost 

Our approach to the computation of such approximations can be viewed as follows: 

(i) We develop and analyze strategies for computing separable expansions hrst based on a 
few instances of the exact solution maps a ^ u{a) and y u{y), or quantities related 
to these maps such as the Taylor coefficients 
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(ii) We then instead apply these strategies to the approximate solution maps 

a H- Uh{a) e 14 and y H- Uh{y) G 14, (5.1) 

which correspond to a certain space discretization process for each instance of the 
solution map in a hxed discretization space 14 - 

Ideally, we would like to obtain error bounds for these approximations which meet the 
benchmark established in the hrst part of the paper in terms of their decay as n grows, up 
to an additional term that reflects the space discretization error. 

We assume that space discretization can be performed within a certain hnite element 
space 14 of dimension W, through a numerical solver which we may apply for each individual 
instance of a G ^ or |/ G t /4 to compute approximate solutions Uh{a) or Uhiy) from 14. For 
simplicity we assume 

(i) A cost Cfi for computing Uh{a) or Uh{y) that is independent of a or y. 

(ii) An error bound 


sup \\u{a) - Uh{a)\\v = sup \\u{y) - Uh{y)\\v < £{h), (5.2) 

aSvA y&Uj^ 


therefore also independent of a or y. 


Recall that making e{h) small requires to make large and Ch even larger, which is 
one of the motivations for reduced modeling. 

As an example of such a space discretization, consider the elliptic equation (11.51) . We may 
then dehne the discrete solution by the standard Galerkin method on I 4 , that is, Uh{a) G 14 
is dehned by 


j a'\/uhia)'\/vh 


j fvh, Vh G 14. 


(5.3) 


D 


D 


We may then use classical techniques of hnite element approximation of elliptic PDFs, see [20] 
or [S], in order to obtain an error bound of the form 05.21) . First, assuming that 0 < r < a < R 
for all a & A, Cea’s Lemma ensures that 


\u{a) - Uh{a)\\v < 


R . II / N 

mm Mia) 
r vh€Vh 


Vh\\v- 


(5.4) 


Then, if (14)/i>o are Lagrange hnite elements spaces of polynomial degree m > 1 associated 
to a regular family of conforming simplicial partitions {Th)h>o with mesh size h > 0, we have 
for 1 < r < m + 1 the classical approximation bound 

min \\u{a) — Vh\\v ^ Ch'^~^\\u{a)\\H^m)- (5.5) 

We therefore obtain an error bound 05.2p with e{h) ~ provided that u{a) is bounded in 
H^{D) independently of a G 
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Remark 5.1 Our approach to space discretization means in particular that, when computing 
polynomial approximations by trunctated expansions, the Taylor or Legendre coefficients are 
discretized in the same finite element space Vh, independently of their index v. An alternate 
approach, which we do not embark in here, is to search for space discretizations of these 
coefficients which vary with u, with the objective of optimizing the total number of degrees 
of freedom reguired to reach a given accuracy. This approach is analyzed in for 

Legendre and Taylor series. See also for computational approaches based on a global 
adaptivity both in the parameter and space variable. 

When evaluating the total computational cost for computing the separable approximation 
Un, we make the distinction between two types of cost: 

(i) The offline cost refers to the computation of the functions Vi,... ,Vn which are used in 
fll.32p or fll.33p . or equivalently of the space 14 := span{ni,... ,Vn} which is used to 
simultaneously approximate all members of the solution manifold Ad. 

(ii) The online cost which refers to the computation of the approximate solution Un{ci) or 
Un{y) from 14 for any given query a E A or y E U . 

One can view the offline cost as a “one time only” hxed cost, while the online cost could be 
repeated many times in certain applications of reduced modeling. 

5.2 Polynomial approximation algorithms 

The hrst class of numerical methods that we study searches for computable polynomial 
approximations of the general form fll.SOp . For any hnite set A, we dehne the space 

14 := 1 ^ <8 Pa, (5.6) 

of Id-valued polynomials associated to A, where 

Pa := span{|/ ^ y'^ : v E A}, (5.7) 

is the corresponding space of real valued polynomials. Therefore a polynomial approximation 
of the form fll.50p belongs to Va„. 

There are two main issues in the design of these methods : 

(i) Given an index set A„, how do we construct the polynomial approximation fll.50p . 

(ii) How do we select the index sets A„. 

For treating both of these issues, it is very useful to impose that the considered sets A„ are 
downward closed, which we assume in going further. 

Concerning the hrst issue, we present two different strategies which illustrate the impor¬ 
tant distinction between non-intrusive and intrusive methods mentionned in the introduc¬ 
tion. 
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The first strategy, discussed in §5, is non intusive. It computes a polynomial approxima¬ 
tion of the form fll.501) by interpolation of the solution map at well chosen points ... G 
f/ by a method introduced in [TB], in the line of [7I1|72]. In particular it could even be applied 
in a context where the exact model is not known, but only the solver is given. Other impor¬ 
tant representatives of non-intrusive methods, which we do not discuss in this paper, include 
least-square projection methods as developed in |T6l [SU |35], and pseudo-spectral methods as 
developped in [2H1 E] . 

The second strategy, discussed in §6, performs an explicit computation of the truncated 
Taylor series, up to the spatial discretization of the coefficients t^, hy a. recursive method 
introduced in HU. In contrast to the previous one, this approach is intrusive. It strongly 
exploits the particular form of the parametric PDE, and actually it can only be easily imple¬ 
mented for parametric problems (11.11) where V is linear both in u and a. Other important 
representatives of intrusive methods, which we do not discuss in this paper, include Galerkin 
projection methods as developed in [U [5l [23l 02]. 

Concerning the second issue, an important distinction should be made between non- 
adaptive and adaptive methods. In non-adaptive methods, the selection of the set A„ for a 
given value of n is done in an a priori manner, based on available information on the problem. 
Ideally we would like to use the set A„ associated to the n largest coefficients in the Taylor 
or Legendre expansion, however this set cannot be easily identified. Instead, we consider the 
set Kn associated to the n largest a priori estimates obtained in §3 for the C-norms of these 
coefficients. We detail further in ^5.31 the algorithmic construction of the sequence (A„)„>i 
by this approach. 

In adaptive methods, the selection of A„ is made in an a posteriori manner, based on the 
computation for downward closed values of n, for instance using the knowledge of both the 
previous choice A„_i and the computed approximation polynomial Un-i for this choice. One 
reason why adaptive methods might perform significantly better than their above described 
non-adaptive counterpart in the present context is because the a priori bound e^ may lack 
sharpness and therefore only gives a limited indication on the real set of the n largest 
coefficients. In particular, the guaranteed rate n~^ based on these a priori bounds may be 
too pessimistic, and a better rate could be obtained using an adaptive method. However, the 
convergence analysis of adaptive methods is usually much more delicate than that of their 
non-adaptive counterparts. We give examples of adaptive strategies both for interpolation in 
§5 and Taylor approximations in §6, convergence analysis being available only for the latter. 

5.3 Non-adaptive constructions of the sets A„ 

We recall that the a priori estimates obtained in §3 for the Taylor or Legendre have the 
following general form: 

• For the Taylor coefficients, under the assumptions of Theorem 12.81 

\\tu\\v<Ceu, Cj, := (5.8) 

iesupp(!^) 
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for any given sequence p = {pj)j>i of number larger than 1 that satishes the constraint 

• For the Legendre coefficients, under the assumptions of Theorem 12.91 

\\wy\\v <Cey, ey\=C 6{pj){l + 2vj)p~''\ (5.9) 

iesupp(i/) 


for any given sequence p = {pj)j>i of number strictly larger than 1 that satishes the 
constraint 03.481) . 

Also recall that for certain specihc problems, we can sharpen these estimates by improving 
on the constraint 03.48^ imposed on p, see §3.41 Once an admissible sequence p = piy) 
has been hxed for each z/, each resulting estimate is computable as a product of ||i/||o 
numbers. In the proof of Theorem 13.91 we use particular choices of admissible sequences 
p = piy) which ensures the P’ summability of the resulting provided that (||'^j|lx)j>i is 
P summable, for some p < 1. However, one may hope to further improve the estimate by 
using other sequences. 

An important observation is that the above general dehnition of does not guarantee 
that the set A„ corresponding to the n largest is downward closed. Indeed, we are not 
ensured that the sequence {ev)v^j: dehned in fl5.8p or fl5.9p is monotone non-increasing, in 
particular due to the fact that the sequence p is allowed to vary with v. One may try to 
construct the sequences piy) such that the sequence (eiy)i,^jr is monotone non-increasing. 
However, a simpler possibility is to search instead for a surrogate Si,, with 

<s^\= JJ (5.10) 

iGsupp(i/) 

where the Sj{v) are again explicitly given, and in addition {su)y^jr is a monotone non¬ 
increasing sequence. Then, we know that at least one of the sets A„ corresponding to the 
n largest Sy is downward closed. One example of such a surrogate in the case of Legendre 
coefficients is given hy Sy := fy dehned in fl3.168p . for which P summability is also established 
provided that (||V'illx)j>i is P summable, for some p < 1. 

We now discuss the complexity of identifying the downward closed set A„ associated to 
the n largest Sy. In addition to the monotonicity of {sy)y^jr, the following property is useful 
for limiting this complexity. 

Definition 5.2 A monotone non-increasing positive sequence {sy)y(zjr is said to be anchored 
if and only if 

l<j^Sej<Sei, (5.11) 

where ci and Cj are the Kroenecker sequences with 1 at position I and j, respectively. 

This property implies that at least one of the sets A„ corresponding to the n largest Sy 
has the following property. 
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Definition 5.3 A finite downward closed set A is said to he anchored if and only if 

Cj G A and I < j ^ ei & A. (5.12) 

where ei and Cj are the Kroenecker sequences with 1 at position I and j, respectively. 

We now show that for an anchored seqnence {sfijytzjr the identification of the set A„ 
can be executed in at most n^/2 evaluations of Sy. For this purpose, we introduce for any 
downward closed set A its set of neighbors defined by 

N{A) := {z/ ^ A such that A U {z/} is downward closed}, (5.13) 

We also intoduce the set of its anchored neighbors defined by 

7V(A) := {z/ e N{A) : = Q if j > j{A) + 1}, (5.14) 

where 

j{A) := max{j : vj > 0 for some u G A}. (5.15) 

If (sy)i/Gj- is an anchored sequence, we may define the sets A„ = {z^\ ..., z/"} by the following 

induction: 

• Take z^^ = 0 the null multi-index. 

• Given A^ = {z/^,..., z/^}, pick a maximizing over o G N{Ak) and such that the 
new set A^+i is anchored. 

We observe that N{Ak) is contained in the union of N{Ak_i) and of the set consisting of the 
indices 

and + j<j{Ak). (5.16) 

Therefore, since the values of the Sj, have already been computed for z/ G N{Ak-i), the step 
k of the induction requires at most j{Ak) + 1 evaluations of s,^. In addition, the fact that A^ 
is anchored implies that j{Ak) < k — 1. Therefore, the total number of evaluations of Si, in 
order to reach A„ is at most 

Nn = l + 2 +... + {n-l) < nV2. (5.17) 

Finally, let us observe that the computation of a single costs ||z/||o multiplications, and on 
the other hand, for all n G A„, 

2lhllo < JJ + ; h < z/} < #(A„) = u, (5.18) 

iGsupp(i/) 

since A„ is downward closed. The total cost of identifying A„ is therefore at most of the order 
log(rz,) which is generally negligible compared to the computation of the approximation 
polynomial. Indeed, the latter involves n elements from the space V}, and has therefore 
complexity at least nNh which, in the practice of reduced modeling, is much larger than 
log(n) since Nh 3> n. 
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5.4 Reduced basis methods 

A second class of numerical methods is not based on polynomial approximations. Instead, it 
directly seeks choices of functions ui,... for which the approximation of the parametric 
PDE in the resulting n-dimensional space Vn '■= span{ni,... performs almost as good 
as the optimal benchmarks for separable approximations. Recall that these benchmarks are 
measured by n-width dn{M.)v for the uniform error, or by the tail of the singular values 
fll.48p for the mean-square error. 

We discuss in §7 the reduced basis method which targets uniform error estimates, and 
which consists in generating Vn by a selection of n particular solution instances u{a^) for 
i = 1,... ,n, chosen from a very large set of potential candidates. The selection process is 
critical for the success of this algorithm, and one main result is that a certain greedy strategy 
meets the benchmark of the n-width in the sense that it results in similar convergence rates. 

Another representative of this second class of methods, which we do not discuss in this 
paper, is known as the proper orthogonal decomposition method and targets mean square 
estimates. It builds the functions {ui,... ,n„} based on an empirical approximation of the 
exact covariance operator fl 1.441) using a sufficiently dense sampling of the random solution 
u{a). 

One main disadvantage of both reduced basis and proper orthogonal decomposition meth¬ 
ods, compared to the hrst class of methods based on polynomial approximation, is that their 
offline stage is potentially very costly, especially in high parameter dimension. However, 
their potential gain is in that they can get signihcantly closer to the optimal benchmarks for 
separable approximations. This is due to the fact that the best n-term polynomial approx¬ 
imation error may in some cases decay substantially slower than the n-width, as discussed 
in S21 

6 Sparse polynomial interpolation 

In this section, we discuss the construction of polynomial approximations to the solution map 
y ^ u{y) by interpolation. We place ourselves in the same framework as in §3: we consider 
a parametric problem of the form fll.ip such that Assumption A holds for a suitable affine 
representer {'ipj)j>i, so that the solution map y i—)■ «(?/) := u{a{y)) is then well dehned from 
U to V. 

Given A C J- with #(A) = n, we say that a discrete set 

TCU, #(r) = n (6.1) 

is unisolvent ioT Pa if and only if for any values ^ there exists a unique polynomial 

TT e Pa such that 

'^il) = 7 e T. (6.2) 

In such a case, to any real valued function v dehned over U, we associate its interpolation 
polynomial I\v G Pa which satishes 

lAvi'y) = v{'y), 7 G T. (6.3) 
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The interpolation operator I\ is a linear map from the space of real valued functions dehned 
over U onto Pa- It may be written in the usual Lagrange form 

Iav = (6.4) 

7Gr 

where the £a ,7 G Pa are uniquely defined by = ^7,7 for 7)7 ^ T- 

By a standard vectorization procedure, we may define a similar interpolation process that 
maps the space of 17-valued functions defined on U onto the space Va. This amounts in now 
using the 17-valued v{'j) in the definition of the interpolant by fl6.4l) . With a slight abuse 
of notation, we again denote by Ia this operator. From exactly or approximately computed 
instances 

M7 = m( 7), 7 e T, (6.5) 

of the solution map, we may thus compute Jam G Va such that 

lAui^) =Uy, 7 G r. (6.6) 

One of the main attractions of interpolation, also sometimes refered to as collocation in the 
context of parametric PDFs [ainiizs] is that it is a non-intrusive process. 

In addition to the existence and uniqueness of the interpolation polynomial, we point out 
two other properties of the interpolation process that are of interest to us: 

(i) Stability: one typical way of quantifying the stability of the interpolation process is 
through its Lebesgue constant. If T C 17 is a set of unisolvent interpolation points for 
Pa with Lagrange basis elements £a, 7) the Lebesgue constant is defined as 

\\Iau\\l°°{U) S^IC ( \\ (a- 7 \ 

La:=sup-;— -= max> Fa, 7(2/)|- (6.7) 

\ML--iu) ^ 

where the hrst supremum is taken over all non-zero real valued functions u which are 
everywhere defined and uniformly bounded over U. It is easily seen that we obtain 
the same quantity if we instead take the supremum over the set of 17-valued functions, 
using the L°°{U, V) norm in the quotient. The Lebesgue constants typically grow with 
the number n of interpolation points, however it is well known that this growth strongly 
depends on the selection of points. For instance, on the univariate interval [—1,1], the 
Lebesgue constant for interpolation by polynomials of degree n — 1 at n points grows 
exponentially with n for uniformly spaced points and logarithmically for Chebychev or 
Gauss-Lobatto points. 

(ii) Progressivity: we would like to use sequences (A„)„>i of index sets which have the 
nestedness property A„ C A^+i in order to define polynomial spaces with increasing 
accuracy. The sets A„ may be dehned a priori, based on the analysis of best u-term 
polynomial approximations presented in §3, or adaptively generated. In both cases, 
it is desirable that the polynomial interpolation operators /a„+i can be derived in a 
simple way from /a„+i- This requires in particular that the associated unisolvent sets 
of points (F„)„>o are themselves nested. 




It was shown in [18] that such progressive interpolation processes can be derived pro¬ 
vided that the sets A„ are downward closed. We present this approach in ^6.11 and discuss 
its stability properties in ^6.21 We hnally discuss in ^6.31 the computational cost of such 
interpolation schemes, taking into account the space discretization for the computation of 
the instances u{'y), for example using a hnite element method. 


6.1 Sparse interpolation using downward closed sets 

We describe the construction of the interpolation operator for real valued functions, since, 
as previously explained, it induces a similar interpolation operator for Id-valued functions. 

We begin by discussing progressive constructions in the case of univariate polynomial 
interpolation. The starting point is any sequence 


T — (tfc)fc>o, (6.8) 

of distinct points from [—1,1]. We introduce the abbreviated notation 

h := (6-9) 

for the univariate interpolation operator associated with the fc-section of this 

sequence: for any function u dehned everywhere over [—1,1], the polynomial I^u G 
satishes 

Iku{ti) = u{ti), i = 0,...,k. (6.10) 

We can express Ik in a hierarchical form 

k 

IkU = IqU + ^Aiu, (6.11) 

1=1 

also commonly known as the Newton form. We set /_i = 0 so that we can also write 

k 

hu = '^Aiu. ( 6 . 12 ) 

1=0 

Since IkU and Ik-iU agree at the points {to, • • •, 4-i}, h is readily seen that, for k > 0, 


Aku(t) = akhkit), 


(6.13) 


where 


Q-k . u(tk^ iw(tfc). 


is the error at tk of interpolation by Ik-i, and 


hkit) : = 


n 


t-ti 
tk ti 


(6.14) 


(6.15) 
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We also set 


ho{t) := 1. (6.16) 

For all fc > 0, the system {ho,..., hk} is a basis for P^, sometimes called a hierarchical basis. 

Although the sequence T could be arbitrary, the stability of the resulting interpolation 
scheme, as reflected through the growth of it Lebesgue constants, depends very much on 
the choice of T. One interesting choice is the sequence of the so-called Leja points, which is 
initiated from an arbitrary to (usually taken to be be 1 or 0) and recursively dehned by 

k-l 

4 := argmaxj |t - til : te[-l,l]|. (6.17) 

/=o 

With this particular choice, we note that that the hierarchical basis functions satisfy 

||hfc||Loo([_i^i]) = 1, k>0. (6.18) 

The Leja points may be viewed as an incremental variant to the classical Fekete points 

• • • ,4,fc} := argmaxj JJ - 4| : {4, • • •,4} e [-1,1]^^^}, (6.19) 

which, in contrast to the Leja points, are not fc-sections of a single sequence. 

We turn now to the multivariate setting. Starting again with the univariate sequence T, 
we now dehne the points 

Uu ■= ituj)j>i e f7, ly e T, (6.20) 

which are therefore extracted from the tensorized grid T^. We also dehne the tensorized 
operators 

lu := ( 6 . 21 ) 

Recall that the application of a tensorized operator ^j>iAj to a multivariate function 
amounts in applying each univariate operator Aj by freezing all variables except the j- 
th and then applying Aj to the non-frozen variable. We may dehne 4 by induction. For 
this, let us introduce iFk the set of all u such that Uj = 0 for j > k. 

• For k = 1, there is only u = 0 the null multi-index contained in Tq. Then lou is the 
constant function with value u{yo), where yo = (to, to, ■ ■.). 

• For k > 1, assuming that 4 has been dehned for any h G iFk-i, and taking o G J-k, we 
write 

i/=(i/i,h), z>= (z/ 2 , 1 / 3 ,...) G4fc-i, (6.22) 

and for any y E U, 

y = {yi,y), y = {y2,y3,---)- (6.23) 

We then dehne 4 := lui ® 4, that is, 

hu(y) = hvy^(y), (6.24) 

where Vy.^(y) := 4i4(|/i) with Uy the univariate function dehned on [—1,1] by Uy(t) = 

u(y). 
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Note that in the hnite dimensional case U = [—1,1]'^, this induction terminates after at most 
d steps. 

It is easily seen that is the interpolation operator on the tensor product polynomial 
space 

Pj. = ®j>iPj.^., (6.25) 

for the grid of points 

(6.26) 

which is unisolvent for this space. This polynomial space corresponds to a particular set A 
which has rectangular shape. Namely A = R^, where, for any z/ G -F, we dehne the shadow 
of u as 

-.= {u : u < u}. (6.27) 

We thus have Pj, = Pi^^. 

We next dehne in a similar manner the tensorized difference operators 


:= (6.28) 
It follows that the range of is the one dimensional subspace of P^ spanned by 

Hu{y) ■= JJ KjiVj), jy e R. (6.29) 

To a general hnite set A C F", we associate the operator 


ia ■— At,, 

ueA 


(6.30) 


and the grid 


rA := ■■ A}. 


(6,31) 


In the case where A = R^, we hnd that Ta = Ty. It is thus unisolvent for Pa. In addition, 
we then have 

h = ®j>i (5^ A;) = ^ Ap = Ja, (6.32) 

1=0 u<u 

which shows that I\ is the interpolation operator onto Pa for this grid. 

Let us remark that for a general set A, the set Ta is not unisolvent on Pa and Ja is 
not an interpolation operator. However, an important observation is that this is the case 
whenever A is an arbitrary downward closed set. This fact was hrst noticed in [59] for 
bivariate functions, and then used in higher dimensions for particular cases of downward 
closed sets in [55] . 


Theorem 6.1 Let A <Z R he a finite downward closed set. Then, the grid Ta is unisolvent 
for Pa and R is the interpolation operator onto Pa for this grid. 


Proof: Because of the downward closed set property, C Pa for all i/ G A. Hence the 
image of Ja is contained in Pa- In order to prove that it is the interpolation operator for the 
grid Fa, we need to show that, for any function u defined over U, 

Ihuiyy) = u{yy), p e k. (6.33) 

Since #(rA) = dim(PA) this also ensures the unisolvence of Fa for Pa- 
For any u E A, we may write 


Iau = lyU + ^ /kyU . 


(6.34) 


Since is the interpolant on the tensor product grid F,^, and this grid contains yv, it follows 
that 

Iuu{yy) = u{y^). (6.35) 

On the other hand, if F G A is such that F ^ z/, this means that there exists a j > 0 such 
that Pj > Vj. For this j we thus have Aj^u{y) = 0 for aA\ y E U with j-th coordinate equal 
to due to the application of A^. in the j-th variable. Therefore 

Ai^u(y„) = 0. (6.36) 

It follows that Ihuiiiy) = uiy^) which concludes the proof. □ 

The decomposition fl6.30p of Ja as a sum of the various Ay may be viewed as a general¬ 
ization of the Newton form fl6.1ip . This decomposition also yields a simple strategy for the 
fast computation of Jam that we now describe. 

We first observe that if A is a downward closed set of cardinality n > 0, we can find at 
least one v E A which is maximal in A, that is, such that 

i> > u and u ^ u F ^ A. (6.37) 

We may then write 

A = AU{z/}. (6.38) 

where A is a downward closed set of cardinality n — 1. Writing 

Iau = I^u + AyU, (6.39) 

we observe that Ay is characterized by the fact that it belongs to Fy and is characterized by 

Ayu{yy) = 0, 9 EVy- {u} , (6.40) 

and 

Ayu{yy) = lAu{yy) - I},u{yy) = u{yy) - (6.41) 
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Using the tensorized hierarchical basis function Hj,, it follows that 

A^u = ay = ay{u) := u{yy) - IpuiVv)- (6.42) 

By iteration, we may write A = , z/”}, where the z/® are enumerated in such way that 

for each i, Aj = {z^^,..., z/®} is a downward closed set. This allows us to compute Ja by n 
recursive applications of 

IK^u = + ayiHyi. (6.43) 

Note that is a basis of Pa and that any v e Pa has the unique decomposition 

f ^ ayHy, (6.44) 

uGA 

where the coefficients = ay{v) are dehned by the above procedure applied to v. Therefore, 
although the enumeration {z/^,..., z/®®} is not unique, the coefficients a^ = au{u) in the 
expression 

Iau = ^ ayHy, (6.45) 

i/eA 

are unique. Also note that ay{u) does not depend on the choice of A but only on u and u. 
This computation is exactly the same in the case of U-valued functions, now with uniquely 
dehned coefficients a^ E V. 

The recursive computation of the interpolation operator by (16.431) can be used in two 
different contexts: 

• Non-adaptive methods: a nested sequence (A„)„>o of downward closed sets is pre¬ 
scribed in advance, and we use fl6.43p to compute /a„w for increasing values of n. 

• Adaptive methods: the sequence (A„)„>o is not prescribed in advance, and we use the 
computation of Ia^u to dehne A„+i. 

We next give a typical example of an adaptive interpolation algorithm. In order to present 
this algorithm we begin by an analogy: since we have 

Iau = '^ayHy, (6.46) 

u&A 

we may view the interpolant as a truncation of the formal inhnite expansion of u in the 
hierarchical basis 

Y^ayHy, (6.47) 

From elementary results on polynomial interpolation, we know that this series does not 
converge for a general function dehned everywhere over U. Even for the various models of 
parametric PDEs discussed in this paper, we don’t know natural conditions that would ensure 
the unconditional convergence of this expansion towards m, in contrast to the Taylor and 
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Legendre series discussed in §3. In particular we do not know estimates for the coefficients 
ay which would allow us to establish convergence rates for the best n-term truncations. 

Nevertheless, we may still take the same view as in §3, and use for A„ the set of indices 
corresponding to the n largest terms of fl6.46p measured in some given metric Lp{U,V, fi). 
We take p = oo if we search for uniform approximation estimates or p = 2 if we search for 
mean-square approximation estimates. This amounts to choosing the indices of the n largest 
Cl/II tty IIy, where Cy is given by 

Cy := \\Hy\\ip{jj^^y (6.48) 

In the case where jj, is the uniform measure, we also have 

Cy ||hy^ ||£^pQ_]^ (6.49) 

i>i 

Note that in the case where p = oo and if we use the Leja sequence, we are ensured that 
\\Hy\\L°°{u) = 1 and therefore this amounts to choosing the largest ||tty||y. The defect of this 
strategy is that the sets A„ are not ensured to be downward closed. In addition, we generally 
cannot afford an exhaustive search for the n largest contributions in (16.461) . 

In order to build a feasible adaptive algorithm, we need to limitate this search. In what 
follows, we describe a greedy algorithm proposed in [18] for the selection of the sequence 
(A„)„>i, which uses the set of neighbors N{K) defined by fl5.13p . We first give an idealized 
version of this algorithm which cannot be applied as such. 

Greedy Interpolation Algorithm: We start with Ai := {0} the null multi-index. Assum¬ 
ing that A„_i has been selected and that the (tty)i.gA„_i have been computed, we compute 
the ay for z/ G A(A„_i). We then set 

y” := argmax{cy||ay||y : v G A(A„_i)}, (6.50) 

and define A„ = A„_i U {//"■}. 

Note that when p = oo and T is the Leja sequence, this strategy amounts to choosing 
the y G N{An-i) that maximizes the interpolation error at the new grid point which would 
be added by adjoining y, that is, setting 

i/” := aTgmax{\\u{yy) - lA„_Myu)\\v ■ ^ e A(A„_i)}. (6.51) 

The above greedy algorithm is not computationally feasible since we are in principle working 
with infinitely many variable (pj)j>i, in which case the set of neighbours A^(A) to be explored 
has infinite cardinality. One way to circumvent this defect is to replace in the algorithm the 
inhnite set A(A„) by the finite set of anchored neighbors A(A„) defined by fl5.14p . 

One more serious defect of this algorithm is that it may fail to converge, even if there 
exist sequences (A„)„>o such that fastly converges towards u. Indeed, if it happens that 
/AyU = 0 for a certain z/, then no index u > u will ever be selected by the algorithm. As an 
example, consider a two dimensional function of the form 

uiy) = Ui{yi)u2iy2), (6.52) 
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where Ui and U 2 are non-polynomial smooth functions such that M2 (to) = ^2(^1). Then the 
sets A„ selected by the algorithms will consists of the indices z/ = (fc,0) for/c = 0 ,...,n — 1, 
since the interpolation error at the point (tfc,ti) always vanishes. One way to avoid this 
problem is to change the strategy by alternating the selection of using fl6.5ip with a se¬ 
lection rule ensuring that all indices are eventually picked. For example, when n is even, we 
define z/” according to fl6.51l) . and when n is odd we pick for z/” the multi-index v G iV(A„) 
which has appears at the earliest stage in the neighbors of the previous sets A^. In summary, 
this results in the following algorithm. 

Alternating Greedy Interpolation Algorithm: We start with Ai ;= {0} the null multi¬ 
index. Assuming that A„_i has been selected and that the (ay)i/GA„_i have been computed, 
we compute the for z/ G A(A„_i). We set, if n is even, 

z/” := argmax{C[/||a,y||y : z/ G iV(A„_i)}, (6.53) 

and, if n is odd, 

z/" := argmin{/c(z^) : u G iV(A„_i}, k{v) := min{/c : v G iV(Afc)}. (6.54) 

We then define A„ = A„_i U {z/"}. 

Even with such modifications, although the adaptive algorithm seem to behave well in 
many practical instances, the convergence of the interpolation produced by this algorithm is 
still not guaranteed. It is an open problem to understand which additional assumptions on 
u ensure convergence, and more importantly a convergence rate that is comparable to that 
which is proved for best n-term approximations based on Taylor and Legendre series. Note 
that the solution to this problem need to involve the initial choice of the univariate sequence 
T, which, as discussed in the next section, strongly affects the stability and convergence 
properties of the interpolation process. In the next section, using this stability analysis, we 
establish convergence rates for the interpolation algorithm, however based on non-adaptive 
choices of the sequence (A„)„>i. 

Remark 6.2 A very similar greedy algorithm was proposed in [3H] in the slightly different 
eontext of adaptive guadrature, that is, when we want to approximate the integral of u over 
the domain U rather than u itself. In that case, one natural choice is to pick the new neigbor 
V that maximizes the absolute value of the integral o/Aj^m. 

Let us conclude this section by mentioning that there exists several natural generalizations 
to the above described construction of the sparse multivariate interpolation process. 

The hrst obvious one is that we could work on more general tensor product domains of 
the form 

U = ®j>oUj, (6.55) 

where the Uj are univariate intervals or other bounded domains in M or C, and define points 
yy by tensorization of sequences 

Tj = {tj,k)k>o, (6.56) 
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of pairwise distinct points, each of them picked from Uj. 

The second generalization is that we could start with univariate systems other than 
polynomials that still having a hierarchical interpolation structure. We consider a general 
index set S equiped with a partial order < and assume that there exists a root index 0 such 
that 0 < 7 for all 7 G iS. Given a grid of pairwise distinct points G = we say that a 

family of functions dehned over [— 1 , 1 ] is a hierarchical basis associated to the grid 

G if and only if ho(f) = 1 and 

= 1 and = 0 if 7 < 7 and 7 7 ^ 7 . (6.57) 

By tensorization, we obtain an index set T C of hnitely supported sequences, equiped 
with a partial order < induced by its univariate counterpart. This allows us to dehne 
downward closed sets in in a the same way that we have for the particular case G = N. 
For V = (zzj)j>i G T^ we also dehne the points yy & U and the tensorized hierarchical 
functions Hy in the same way as in fl 6 . 20 l) and fl6.29l) . Then, if A is a downward closed set, 
we may inuctively dehne an interpolation operator Ja onto the space 

:= spa.n{Hy : v G A}, (6.58) 

associated to the grid Ta, using the same recursion 

Iau = I]^u +ttyHy, ay ■.= ay{u) = uiyjy) - Ixu{yy), (6.59) 

where z/ ^ A and A is a downward closed set such that A = A U {i^}. We initialize this 
computation for A = {0}, where 0 is the null multi-index, by dehning /{o}M as the constant 
function with value u{yo). Examples of relevant hierarchical systems include the classical 
piecewise linear, or more generally piecewise polynomial, hierarchical basis functions. With 
such choices the spaces Ha include as particular cases the well-studied piecewise polynomial 
sparse grid spaces, see uni for a survey on this topic. 

6.2 Stability 

We now turn to the stability analysis of the interpolation operator. We recall the Lebesgue 
constant dehned in fl6.7p . One principal interest of the Lebesgue constant is that it allows us 
to estimate the error of interpolation in terms of the best polynomial approximation error 
in the L°° norm. Indeed, for any u G L°°{U, V) and any u G Va we may write 

\\u — Iau\\l<=^(u,v) < ||m — v\\l^(u,v) + \\Iav — Iau\\l°°(u,v)-, (6.60) 

which yields 

\\u - lAu\\L^mv) < (1 + La) inf \\u - v\\L^mv), (6.61) 

vgVa 

by taking the inhmum over Va- 
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We know from the results in ^3.81 in particular Corollary 13.261 that for relevant classes 
of parametric PDEs, we can hnd nested sequences of downward closed sets (A„)„>i with 
#(A„) = n, such that 

inf \\u — v\\L°°mv) ^ Cn~^, (6.62) 

where s > 0 is some given rate. This holds in particular with s ^ — 1 if the assumptions 
of Theorem 12.91 hold and if in addition (||'0illx)j>i G Therefore we have the error 

bound 

||m - I\^u\\L^(uy) < C{1 + LA„)n“C (6.63) 

if we use such sequences in the interpolation process. This motivates estimating the growth 
of La with n. 

In order to estimate La, we introduce the univariate Lebesgue constants 




sup 




(6.64) 


where the supremum is taken over all non-zero real valued functions u that are everywhere 
dehned and uniformly bounded on [—1,1]. We define an analogous quantity for the difference 
operator A^, namely 


6k := sup 




(6.65) 


and observe that 


6k A -|- Afc, fc > 0, 


( 6 . 66 ) 


where we have set A_i = 0. We introduce for each z/ G the quantities 


6i, := sup 




(6.67) 


so that we have, on the one hand 

6u A A (6.68) 

i>i t>i 

and on the other hand 

LaA^^j/- (6.69) 

i/GA 

The following result from [18] gives an estimate on the growth of La in terms of #(A), 
provided that a similar estimate holds for the univariate Lebesgue constant A^ or for the 
quantity 6k. 

Theorem 6.3 If either one of the estimates 

>^k<{k + lY, k>0, (6.70) 
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or 


(6.71) 


< (^ + 1 )^, > 0 , 


holds for some 6 >1, then the Lebesgue constant La satisfies 


La < (#(A))'+' 


(6.72) 


for any downward closed set A. 

Proof: The case where 06.711) holds is elementary since the hrst inequality in 06.68P yields 

K < + 1)" = (lit''.' + l))” = »(«-))" S (#(''))"• (6-73) 

i>i i>i 

where we have used the fact that <Z A since A is downward closed. Using 06.69P we thus 
obtain 06.72p . 

For the case where O6.70p holds, we observe that 

Afc + Afe_i <{k + 1)^ + k^ < {2k + l){k + l)^-h (6.74) 

The second inequality in fl 6 . 68 p yields 

j>i i>i 

i>i 

<(#(A))»-'n(2>'y+i). 

i>i 

In order to establish fl6.72p . it thus suffices to prove that (t(A) < (#A)^, where 

■=En(2>'l + l)- (6-75) 

j>l 

For this, we use induction on n := fi{A). For n = 1 and A = {0} the result obviously holds. 
Assuming that it holds for some n > 1, we consider a downward closed set A of cardinality 
n + 1. We may assume without loss of generality that z/i 7 ^ 0 for some z/ G A, and denote 
by Tf > 1 the maximal value attained by the coordinate ui when 1 / G A. For 0 < A; < iF, we 
dehne 

Afc := {z> = (z/ 2 , 1 / 3 ,...) : {k, z>) G A} (6.76) 

Each of the set A^ is downward closed and, since iF > 1, we have #(Afc) < #(A) for all 
k = 0,..., K. The induction hypothesis implies 

K K K 

= E E np'-j + 1) = EP*: + 1)‘"(A») < EP^ + (6'77) 

k=0 j>l k=0 k=0 
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Also, we have 


Ax C • • ■ C Ai C Ao, 


(6.78) 


since for /c>l,z/GAfc=^(fc, i/)gA=^(/c — 1,i/)gA^i/G Ak-i. This implies 


k-l 


/c(#(Afc))^ < #(Afc) y^#(Aj), 

j=0 


(6.79) 


and therefore 

K K k-l K 2 

<t(A)<^(#(A,))" + 2^#(A»)^#(A,) = (^#(A,)) =(#(A))^ (6.80) 

k=0 k=0 j=0 k=0 

which concludes the proof. □ 

Remark 6.4 One noticable feature of the above result is that the bound on La only depends 
on the cardinality of A. In particular, it is independent of the number of variables, which 
can be infinite, as well as of the shape of A. 

In view of the above result, we are therefore interested in choosing univariate sequences 
T = (tk)k>o such that the Lebesgue constant Xk or the quantity 6k have moderate algebraic 
growth with k. It is well known that for particular sets of points such as the Chebychev 
points 

Ck := + ^ = (6-81) 

or the Gauss-Lobatto (or Clemshaw-Curtis) points 

Gfc := I cos(^^7rj ; / = 0, ...,/c|, (6.82) 

the Lebesgue constant has logarithmic grows Xk ~ log(fc), therefore slower than algebraic. 
However these points are not adapted to our construction since the sets Ck and Gk are not 
nested as k grows, and therefore are not the /o-sections of a single sequence. 

For the Leja points defined by fl6.17l) . numerical computations of Xk for the hrst 200 
values of k indicates that the linear bound 

Afc < (1 + k), (6.83) 

seems to hold and that this bound could be sharp. However there is currently no rigorous 
proof supporting this evidence or establishing another algebraic rate. Nevertheless, Leja 
points seem to be a good choice for the construction of our multivariate interpolation process. 

Leja points have also been considered on the complex unit disc {|. 2 | < 1}, taking for 
example to = 1 and using again the recursion (I6.17p . now with | • | standing for the modulus. 
These points have the property of accumulating in a regular manner on the unit circle 
according to the so-called Van der Corput enumeration m- Their projections on the real 


97 







axis are called the 3f?-Leja points, and coincide with the Gauss-Lobatto points for values of 
k of the form 2”' + 1 for n > 0. The growth of the Lebesgue constant \k has been studied 
in [m [121 [la [B] for these two families of points. In the case of the complex Leja points, 
this constant is dehned as in fl6.64p . however taking the supremum over functions dehned 
everywhere and bounded over the complex unit disc. It is proved in [T3] that the linear 
bound fl6.83p holds for the complex Leja points. For the 3fJ-Leja points, quadratic bounds of 
the type 

Afc<C'(l + /c)^ and (jfc < (1 + (6.84) 

with C > 1 are established in [15] . 

With such estimates, application of Theorem 16.31 gives us bounds of the form 

La < (#(A))'+^ (6.85) 

for example with 6 = 2 when using the 3?-Leja points. If we combine this bound with fl6.63p . 
we obtain the convergence estimate 

11^ — L°°{u,v) < Cn ^ (6.86) 

which expresses a deterioration of the convergence rate when using the interpolation process 
instead of the truncated expansions studied in §3. 

We now present a sharper analysis, introduced in [18], which reveals that this deteriora¬ 
tion actually does not occur for the models of parametric PDFs which are of interest to us. 
This analysis is based on the following Lemma which gives an estimate of the interpolation 
error in terms of the tail of the Legendre coefficients of u multiplied by algebraic factors. 

Lemma 6.5 Assume that the Legendre expansion fl3.26p of u is unconditionally convergent 
in L°°{U, V). If the univariate seguence T = {tk)k>o is such that that fl6.70p or fl6.7ip holds 
for some 9 >1, then, for any downward closed set A, 

\\u - Iau\\loo(u,v) < ‘2'^PAb)\\wu\\v , (6.87) 

v^A 

where b := 6 + 1 and 

PAb) ■■= 

i>i 

Proof: The unconditional convergence of the Legendre series allows us to write 

Iau = I A j ^ wJaPv = ^ WyPy + ^ wJaPu, (6.89) 

u&T ^gA u^A 

where we have used that Pj, G Pa because A is downward closed and hence IaPu = Pu for 
every u E A. For the second term, we observe that for each z/ ^ A, 

IaPu = Aj^Pu = AyPu = lAnR^Pu-, (6.90) 

u&A PGAnHi. 
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since = 0 whenever u u and f G Pjy. Therefore 

M - Jam = ^ w^{I - lAnizjPu, 

u^A 

where I stands for the identity operator. This implies 


\u — Iau\\l^(u,v) + lhAnR„)ll'«^y||y <2^LAn_R„ 

u^A u^A 


\Wi,\\v ■ 


Since fl6.70p or fl6.7ip holds, we obtain from Theorem 16.31 that 

Laok. < #(A n < #(7?.)^+' = P.(6), 

which completes the proof. 


(6.91) 


(6.92) 


(6.93) 

□ 


The estimate for the interpolation error in Lemma 16.51 is very similar to that of the 
trunctated Legendre expansion, up to the presence of the factor Pu{b). We may therefore use 
the same techniques as those used in §3 for this expansion in order to establish convergence 
rates for the interpolation error. We hrst establish a summability result which is analogous 
to Theorem 13.251 


Theorem 6.6 Consider a parametric problem of the form fll.ip such that Assumption A 
holds for a suitable affine representer {'ipj)j>i. If the assumptions of Theorem \2.9\ are satisfied, 
and if in addition (||'0il|A:)j>i G 7^(N) for some p < 1, then {j)u{h)\\wi,\\v)u&T G 
the same value of p, and for any b > 0. 


Proof: This proof is very similar to the proof of Theorem 13.251 and so we only sketch it. We 
obtain a similar estimate 

Pu{b)\\wu\\v < hj, := rEffi)rFffi), (6.94) 

where has exactly the same form as in fl3.167p up to a change in the multiplicative constant 
Co, and rp is now given by 


frffi) 


Ck (1 + 2 yj)^^^ + 

jSFnsupp(i/) 


eui 


Sbjlopl 


(6.95) 


By a similar reasoning, up to a modihcation in the choice of J and ffi one then shows that 
the sequence belongs to and that it is monotone non-increasing. □ 


Combining this result with Lemma 16.51 we obtain the following corollary which shows 
that the interpolation process converges without deterioration of the rate established for the 
Legendre series. 
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Corollary 6.7 Consider a parametric problem of the form fll.ip such that Assumption 
A holds for a suitable affine representation fll.151) . Assume that the univariate sequence 
T = (tk)k>o is such that that fl6.70p or 06.711) holds for some 9 > 1. Then, if the assumptions 
of Theorem \2.9\ are satisfied, and if in addition (||V'il|x)j>i G for some p < I, there 

exists a sequence of nested downward closed sets (A„)„>i such that = n and such that 

\\u - Ia„u\\loo,uy) < Cn~^, n>l, s - 1. (6.96) 

p 

6.3 Space discretization and computational cost 

In the practice of numerical computation to parametric PDEs, as explained in §5, we replace 
the instances u{ip,) by their approximation UhiVu) £ 14 computed by a numerical solver. We 
denote by lA,hU the resulting interpolation polynomial, which belongs to the space 


VA,h ■— Pa ® 14- 

(6.97) 

In other words, we have 


7a, • I 

(6.98) 

where 


Uh-y^ Uh{y), 

(6.99) 

is the approximate solution map acting from U into I4. 



We begin by discussing the accuracy of this polynomial approximation. For a given set 
An, one way to estimate the total interpolation error is by writing 


14 — lA„,hU\\L°°{U,V) 4 14 ~ Ia„u\\l°°{U,V) + I4a„(m — Uh)\\L°°(U,V)- (6.100) 

The hrst term is estimated by the results in the previous section, such as Corollary 16.71 and 
for the second term we may write 

l4An(“ ~ < IbA„£(h), (6.101) 

where e{h) is the acuracy of the numerical solver. Under the assumptions of Corollary 16.71 
this result in an error estimate of the form 

14 - -^A„,hM|4°°(f/,v) < Cn”® + n®+4(h), (6.102) 

where we have also used Theorem 16.31 for bounding La„. 

There is a more efficient way to estimate the error in the case where the approximate 
solution map Uh may be viewed as the solution map of a discrete parametrized problem of 
the form 

Vhiuh,a) = 0, (6.103) 

where Vh '■ Vh^X ^ W and with similar properties as the original parametric problem fll.ip . 
Consider for example the case of the elliptic equation fll.51) and its Galerkin discretization 
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on Vh defined by fl5.3p . It is then readily seen that the discrete solution map a e-)■ Uh{o) has 
the same boundedness and holomorphy properties as the original map a u{a). In turn, 
we obtain the same estimates for the Taylor or Legendre coefficients of the solution map 
y 1 -^ Uh{y) '■= u{a{y)). This type of problem is therefore covered by the following discrete 
counterpart to Corollary 16.71 

Corollary 6.8 Consider a parametric problem of the form (16.103p such that Assumption 
A holds for a suitable affine representer {ijj)j>i- Assume that the univariate sequence T = 
(tk)k>o is such that that fl6.70p or fl6.7ip holds for some 9 > 1. Then, if the assumptions 
of Theorem \2. 9\ are satisfied by the map a Uh{a), with the open set O and the bound in 
fl4.4p independent of h, and if in addition (||'0illx)j>i ^ for some p <1, there exists a 

sequence of nested downward closed sets (A„)„>i such that #(A„) = n and such that 

\\uh-lAr,Uh\\L^{u,Vh) n>l, s (6.104) 

p 

where the constant C is independent of h. 

Under the assumptions of the above corollary, we may now estimate the total interpola¬ 
tion error by writing 

Wu — lA,„hU\\L^{U,V) < IWh — ~ 'Oh\\L°°{U,V), (6.105) 

which yields the bound 

\\u - < Cn~^ + e{h). (6.106) 

This bound is clearly more favorable than fl6.102p . 

We next turn to the estimate of the computational cost, starting with the offline cost. 
Assuming that A„ is given, we want to pre-compute the coefficients ■= Oi^iuh) G 14 in 
the expression 

lA„,hU = ^ ai,^hHy. (6.107) 

I'GA.n 

We begin by computing the discretized instances Uhijju) for z/ G A„, as vectors of size of 
their coefficients in a given basis of 14. This has a cost of order nCh where Ch is the individual 
cost of one application of the discrete solver. The coefficients ay^h are then computed by 
recursive application of the formula (16.431) based on these discretized instances. At the stage 
i of this recursion, the coefficient •= oiyi{uh) is computed by a linear combination of 
the i — 1 previous ones, according to 

i—1 

O^U’-,h ^ ^ CXyl kiHyl(^yyt.f (6.108) 

1 = 1 

Note that in view of the definition of the Hy, we can compute Hy{y) for any y E U in 
|z/| = 4 multiplications. Therefore the total cost in this stage is bounded by 

i i 

iNh ^ |4| = iNh + ^(/ - 1) = iNh + i{i - l)/2. (6.109) 

i=i i=i 
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Summing over i = 1,..., n, we thus find that the total offline cost is of order 

Coff ~ nCh + n^Nh + ~ nCh + n^Nh. (6.110) 

Here we neglect, the cost relative to the computation of the Hi^i{y^i) as well as the cost 
of n? log(n) needed for the non-adaptive selection of the sets A„, as derived in ^5.31 since 
is typically much larger than n. 

We finally evaluate the online cost. Since the online stage simply amounts in the combi¬ 
nation of the for computing the interpolant, we find that this cost is of the order 

Con ~ nNh, (6.111) 

where we have neglected the cost relative to the computation of the Hy{y) for the given 
yeU. 

If £ is a targeted order of accuracy, and if we have the error bound fl6.106p . then one way 
to reach this accuracy is to take both Cn~^ and e{h) of the order of e. Denoting by h{e) the 
inverse function of e{h), that is, 

h(£o) = ho e{hQ) = Eq, (6.112) 

we thus find that the interpolation algorithm reaches the order of accuracy e at costs 

Cofr(£) ~ -1- e~^/^Nh(e) and Con{e) ~ (6.113) 

It should be noticed that this algorithm is immune to the curse of dimensionality since the 
above trade-off between accuracy and complexity is obtained with infinitely many variables. 

7 Taylor approximation 

The results established in §3 show that effective polynomial approximations fll.SOp to the 
solution map y u{y) can be obtained by best n-term truncations of the Taylor series 
fl3.2Up . for relevant classes of parametric PDEs. 

In this section, we discuss a strategy, proposed in na, for numerically finding a good 
n term Taylor approximation to u. This numerical method rests in part on the effective 
computation of the Taylor coefficients ti^. In contrast to the interpolation method discussed 
in §5, the strategy for n-term Taylor approximations is intrusive and strongly exploits the 
specific structure of the parametric PDE. In fact, it only applies to the limited, yet relevant, 
range of problem where the parametric problem has the form of a linear operator equation 

B{a)u = /, (7.1) 

where f E W and B{a) G CiV, W) for a pair of Hilbert spaces {V, W), and where 

a^H(a), (7.2) 
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is a continuous linear map from X to £(V, W). In other words, the problem map 

V : {u, a) ^ f — B{a)u, (7.3) 

is linear both in a and m, up to the constant term /. Recalling our four examples of parametric 
PDEs discussed in §2, that is, equations (11.51) . (12.401) . (I2.58p . and (I2.66p . the first two fall in 
this category while the last two do not. 

Any linear parametric problem between Hilbert spaces V and W can be expressed through 
a variational formulation fl2.38p for a pair of Hilbert spaces (V, V) where V is the antidual 
of W. In our present setting, this formulation is: hnd u{a) G V such that 

B{u{a),v; a) = L{v), u G H, (7.4) 

where and L are continuous sesquilinear and antilinear forms over V xV and V 

respectively, and where, throughout this section, we make the additional assumption that 

a ^ B{-, •; a), (7.5) 

is a continuous linear map X to 03 the set of continuous sesquilinear forms over V x V. We 
work under the following assumption. 

Assumption AL: The parameter set A has a complete affine representer and the 

sesquilinear form Bf, qa) satisfies the inf-sup conditions fl2.3Up for all a G a{U). 

Notice that this assumption requires that the problem fl7.ip has a solution for any a G 
a{U) and for all / G W, in contrast to Assumption A which requires that that it has a 
solution for all a G a{U) but only for the given / G W. Under this assumption, the solution 
map 

y ^ u{y) = B{a{y))-^f, (7.6) 

is well defined over U. Examples of problems falling in this category are the elliptic equation 
(ll.5p and the parabolic equation (I2.40p under the uniform elliptic assumption UEA(r) as 
discussed in ^2.41 Since all maps in the chain 

a 1 -^ B{a) I—)■ B{a)~^ i—)■ u{a) = B{a)~^f, (7.7) 

are inhnitely Frechet differentiable at a G a{U), and since y i—)■ a{y) is affine, we are ensured 
of the existence of the partial derivatives 



at every y ^ U and for all u G X, and therefore of the Taylor coefficients ti, = fid’^u{0) G V 
for all u E X. 

We first show in ^7.11 that these coefficients can be computed by a simple recursive 
procedure which takes advantage of the linear structure of the problem. When the truncation 
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sets A„ are downward closed, this procedure computes the coefficients (ti/)yeA„ at the cost 
of solving n times a linear problem with operator B := B{a). Similar to the interpolation 
method in §5, the downward closed sets (A„)„>i for which we compute the coefficients ty can 
either be chosen in an a priori manner, based on the a priori estimates for the coefficients 
||ty||y, or adaptively built. Various adaptive selection strategies for finding the sets A„ are 
proposed in ^7.21 and a convergence analysis is given in ^7.31 for one of them in the particular 
case of the elliptic equation fll.Sp : we show that if the sequence belongs to the 

space defined in ^3.81 for some p < 1, then the adaptive strategy generates downward 

closed sets (A„)„>i such that the tructated Taylor series converges in L°°{U,V) with the 
expected rate n~^ where s —1. Recall that Theorem 13.251 shows that f||t.,||v').,c7r G 
whenever the assumptions of Theorem 12. SI hold and (||V'il|v)j>i G £^(N). Space discretization 
and computational cost of the adaptive and non-adaptive strategies are discussed in ^7.41 

7.1 Recursive computations of the Taylor coefficients 

Since B(-, •, a) is defined for all a G X, we may introduce the sesquilinear forms 

^(■r)-■= and Bj(-, ■) := B(-, ■; (7.9) 

Then, the solution map p u(p) = u(a(y)) is dehned by 

B(u(p),v;p) = L(v), veV, (7.10) 

where 

+ ( 7 . 11 ) 

i>i 

Note that B as well as each individual Bj is bounded, that is, belong to 03. In addition, 
we know from Assumption AL that B satishes the inf-sup conditions fl2.30p . and so does 
B{-,--,p) for any p E U. The following result shows that the Taylor coefficients of the solution 
map p u{p) satisfy simple equations which allow us to compute them in a recursive way. 

Lemma 7.1 Consider a parametric problem of the form fl7.ip such that Assumption AL 
holds for a suitable affine representer Then the Taylor coefficients of the solution 

map y eE- u{y) satisfy the equations 

B{ty,v) = Ly{v), veV, (712) 

where Ly{v) := L{v) if u = 0 is the null multi-index, and 

Ly{v) := - ^ Bj{ty_ej,v), (7.13) 

iesupp(y) 

when V E X — {0}, where Cj := (0,..., 0,1, 0,...) is the Kroenecker sequence with 1 at 
position j. 
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Proof: The case i/ = 0 is immediate since to = '^(0) and 5(0) = B. For the other values of 
p, we apply the operator O'' to the equation 

L{v) = B{u{y),v-y). (7.14) 

Since L does not depends on r/, and due to the affine form fl7.11l) of 5(-, •,?/), we obtain by 
the multivariate Leibniz rule 

0 = d''{B{u{y),v]y)) 

= d'' (B{u{y),v) + ^ yjBj{u{y),v)^ 

_ i>i 

= B{d''u{y),v) + ^ yjBj{d'^u{y),v) + ^ UjBj{d''-^^u{y),v). 

j>l jGsupp(i^) 


At y = 0, this gives 


B{d'^u{0),v) = - ^jBj{d'^-^^uiO),v). 

jes\ipp{u) 


(7.15) 


Dividing by z/! = (z/ — this gives fl7.12p . 


□ 


For further purposes, we give the particular expression of the equations fl7.12p in the case 
of the elliptic equation fll.Sp . In this case, V = V = Hq{D) and the sesquilinear forms are 
given by 

B{u,v) ■.= aVu-'Vv, (7-16) 


D 


and 


Bj{u,v) := / ipjVu ■ Vv. 


(7.17) 


D 


Therefore, for z/ G 5 — {0}, the coefficient ty is the solution of the boundary value problem 

■ Vu, V eV. (7.18) 


D 


aVty ■ Vv = — ^ J 

jesuppiu) ^ 


Note that, in the particular case where z/ = ej, that is, when ty = dy.u{0), this relation has 
the form 

J aVdy.u{0) ■Vv = — ^ J 'ipjVu{0) ■ Vv, v eV, (7.19) 


iGSUpp(l/) jy 

and it can be derived from the general expression of the Frechet derivative du{a) obtained 
in §2.1, applied to h = if^j. 

Lemma ITT] shows that if A„ is a downward closed set, then it is possible to compute the 
n Taylor coefficients {ty)y^\^ by solving exactly n linear problems of the form fl7.12p . This 
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is done by writing A„ = {z/^,..., z/""} where the order is such that all sections ,..., i/^} 
are downward closed sets for k = 1,... ,n and by recursively computing the t^, in this order. 
Then, the right side of the problem 07.121) for computing only depends on the t^i for i < fc 
which have already been computed. 

In practice, these linear problems can only be solved approximately, through a space 
discretization, for example using the hnite element method. This induces an error in the 
computation of the Taylor coefficients. We deal with this issue in ^7.41 and assume for the 
moment that these problems are solved exactly. 

As already explained in ^5.31 one non-adaptive strategy consist in dehning A„ as the set 
of indices corresponding to the n largest a priori estimates for the Taylor coefficients or 
the n largest surrogate such that the sequence (si,)ygjr is monotone non-increasing. We 
have observed that the total cost of identifying A„ is then of the order log(?T.) which is 
negligible compared to the computation of the approximation polynomial. We then know 
from the results in ^3.81 that if the assumptions of Theorem 12.81 hold for the solution map 
a I—)■ u{a) and if in addition (||'0il|x)j>i G for some p < 1, we have the convergence 

rate 


sup 

y&U 


uiy) - Uy" 


< Cn-% 

V 


s 



(7.20) 


using these a priori selected (A„)„>i. 

In the next section, we discuss adaptive strategies for the selection of the sets (A„)„>i. 
Given the above result on a priori choices for the A„, one might ask why should we even 
consider adaptive strategies? The answer is that it may be that the best n-term Taylor 
approximations of u actually perform much better than the proven rate 0(rz,“^) established 
using the a priori chosen A„. This is possible, since we do not have any results that say the 
0(n“^) rate is best possible under the assumption (||'^j||)j>i G i^(N) and since the estimates 
of ||G||v by the computable surrogate could be too pessimistic. 


7.2 Adaptive algorithms 

For any downward closed set A, given the Taylor coefficients the equations fl7.12p al¬ 

low us to compute the Taylor coefficients ti^ for u G A^(A), where A^(A) is the set of neighbors 
of A dehned in fl5.13p . This suggests an adaptive algorithm in the same line as the greedy 
interpolation algorithms proposed in §5.2. Once again, we start with an idealized version of 
this algorithm which cannot be applied as such. We discuss more practical versions later. 


Greedy Taylor Algorithm: We start with Ai := {0} the null multi-index. Assuming 
that A„_i has been selected and that the (G)i/ga„_i have been computed, we compute the 
G for 1 / G A(A„_i). We then set 

z/'" := argmax{||G|ly : z^ G A(A„_i)}, (7.21) 

and dehne An = A„_i U {z/"'}. 
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The rationale behind this procedure is that if the sequence (||ty||y)j,gjr were monotone non¬ 
increasing, it would automatically select the n largest terms of this sequence. Similar to the 
greedy interpolation algorithm, the above algorithm needs to be modihed in the following 
two directions: 

(i) In order to guarantee finite complexity of the search in the case of inhnitely many 
variables, the infinite set iV(A) should be replaced by a finite one such as the set of 
anchored neighbors iV(A) dehned by fl5.14p . 

(ii) In order to guarantee convergence, one should alternate the selection of using (I6.5ip 
with a selection rule ensuring that all indices are eventually picked. 

This results in the following algorithm, which is similar to the alternating greedy interpola¬ 
tion algorithm. 

Alternating Greedy Taylor Algorithm: We start with Ai := {0} the null multi-index. 
Assuming that A„_i has been selected and that the have been computed, we 

compute the for u G A(A„_i). If n is even, we set 

:= argmax{||G|ly : z/ G iV(A„_i)}, (7.22) 

and, if n is odd, we set as in (I6.54p . We then define A„ = A„_i U 

It is easily checked that the selection of z/” by (16.541) ensures that the sequence (A„)„>i 
is an exhausion of J^. Therefore, we are ensured that if the Taylor series (13.201) converges 
unconditionally towards u in L°°{U, V) or in any other norm, the approximations produced 
by the greedy algorithm converge towards u in the same norm. However, we do not have 
much information on the rate of convergence. 

In view of the results obtained in §3, a legitimate objective is to build adaptive algorithms 
that can be proven to converge at a rate that is comparable to that which is proved for best 
n-term Taylor approximations, for relevant classes of parametric PDEs. Let us point out 
that a similar objective can be attained when considering the spatial discretization of a 
single PDE, by either adaptive wavelet methods [211 1221 EH] or by adaptive finite element 
methods HQHH [86] . More precisely, these paper show that specific rehnement strategies 
based on a posteriori analysis generate adaptive wavelet sets or adaptive meshes such that 
the approximate solution converges with the optimal algebraic rate allowed by the exact 
solution, as the number of wavelets or elements grows. One key tool in these algorithms, 
is the use of a refinement procedure which guarantees that the error decreases by a hxed 
amount after the refinement is performed. This procedure is called hulk chasing, and requires 
that, in general, more than one wavelet/element is added/refined at each step. 

In the present context of n-term Taylor approximation, it is possible to introduce similar 
bulk chasing procedures, which, in general, require the selection of more than one term from 
the Taylor expansion at each step. Iterating these bulk chasing procedure produces a nested 
sequence (A^)fc>i of downward closed sets with A^ = {0}. Here, we use the notation A^ 


107 












instead of in order to stress that #(A^) is in general larger than k. When we want to 
index these sets by their cardinality, we may define the sets 

A„:=A^ for n = n(fc) := #(A^), (7.23) 

which are indexed by the integers n G {n{k) : k>l}. Of course, some indices n are missed 
in the A„ notation but these can be hlled in by simply repeating the sets A^. The resulting 
sequence (A„)„>i then satishes #(A„) < n. 

We now discuss one specihc procedure of this type for which it is proved, in the particular 
case of the elliptic problem (11.51) . that the resulting adaptive approximation converges with 
an algebraic convergence rate that matches the rate that is established when keeping the 
largest n-terms in the Taylor expansion. For this purpose, we introduced for any hnite 
downward closed set A its margin M(A), defined by 

M(A) := {z/ ^ A : 3j G supp(z/) : u — ej e A}, (7.24) 

Note that the set of neighbors A^(A) can be defined by 

A^(A) := {z/ ^ A : Vj G supp(z/) : u — Cj E A}. (7.25) 

Therefore, we have N{K) C M(A) and this inclusion is generally strict. Still, for any 
downward closed set A, given the Taylor coefficients the relations (I7.12j) allow us to 

compute the Taylor coefficients t^, for u G M(A). 

For the rest of this section, we assume that the Taylor coefficients of the solution map 
are summable, that is, 

J2\Mv<oo. (7.26) 

Note that, according to Theorem 13.91 this holds if the assumptions of Theorem 12.81 are 
satished and if in addition (||'0jl|x)j>i ^ for some p < 1. For any set S' C we 

introduce the quadratic energy 

eiS):=J2\Mv- (7.27) 

ueS 

We also dehne for any finite downward closed set A the quadratic error 

ff(A):=^||(„||^ (7.28) 

u<^A 

Note that since the functions p i—)■ do not form an orthonormal system, this quantity 

differs from the mean-square error between u and its truncated Taylor series. 

The bulk chasing procedure consist in building the new set A^ by adding to A^“^ a subset 
of the margin := M(A^“^) which captures a prescribed portion of its energy, that 

is, such that 

e(F''“^) > ee{M^-^), (7.29) 

for some fixed 0 < 6^ < 1. 
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The objective of this procedure is to reduce the quadratic error by a hxed amount. 

This will be achieved provided that the Taylor expansion fl3.2Up satishes a so-called “satura¬ 
tion property”, which is analogous to that which is sometimes established in order to prove 
convergence of adaptive finite element methods, see [361 EOl E]- In ^7.31 we establish the 
validity of this property in the case of the parametric elliptic problem fll.51) . provided that 
the uniform ellipticity property UEA(r) holds for some r > 0. For now, we start from this 
property viewed as a general assumption in order to analyze the convergence of adaptive 
algorithms based on bulk chasing. 

Saturation property: there exists a fixed constant <5 > 1 such that for any finite downward 
closed set A, 

a{A) < 6e{M), (7.30) 

where M = M(A). 

If this property holds and if we use the bulk chasing procedure to construct (A^)fc>i we 
may write 

a{A^) = a{A^-^) - eiS’^-^) < a{A'^-^) - < Ka{A'^-^), (7.31) 

where 

a 

K = 1 - - < 1. (7.32) 

0 

Therefore, the quadratic error decreases by a fixed amount at every step of the algorithm, 
and in particular, at step k we have 


a(A^) < Ck\ 


(7.33) 


where C := k ^ 

The set should be chosen as small as possible, however we need to ensure that 

the new set A^ := A^“^ U is still downward closed. This is executed by the following 
algorithm, which we first present in an idealized form that cannot be applied as such. 

Bulk Chasing Taylor Algorithm: Having fixed 0 < 6^ < 1, we start with A^ := {0} 
the null multi-index. Assume that A^“^ has been selected and that the coefficients t^ have 
been computed for u G A^“^. For all u G we compute the t^ and the quantities 

roy-.= m.ay.{\\ty\\v '■ h > u asid u E . (7.34) 

We dehne as the set of indices v G corresponding to the I largest niy, for the 

smallest value of I such that the bulk condition fl7.29l) is met. We then dehne 

A^ := A^-^ U (7.35) 
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Remark 7.2 The quantity my is introduced in order to guarantee that the new set is 
downward closed. This can always be ensured due to the monotonicity property 

z/, G and i> > u mp < (7.36) 

Another option would be to define as the smallest subset of such that the bulk 

condition fl7.29p is met and such that := is monotone, however it is not clear 

if there is a simple algorithm for determining such a set. 

The above bulk chasing algorithm is not computationally feasible due to the fact that 
the margin M(A) of a hnite downward closed set A has inhnite cardinality in the case of 
countably many variable yj. We want to modify it by restricting the computation of the ty 
and the bulk search to a hnite subset of M^. In order to accomplish this, we take the usual 
view in numerical computation, where we are given a target accuracy e > 0 and we want the 
algorithm to achieve this accuracy as efficiently as possible. So, in our modihed algorithm, 
we design the procedure so that the algorithm terminates when a{A^) < Ce for some hxed 
constant C to be specihed later. We begin by introducing a procedure that computes from 
the Taylor coefficients {ty)y(z\ a hnite version of the margin M = M(A) which captures its 
energy up to accuracy e: if A is a hnite downward closed set with margin M, then 

M = SPARSE(A, ity)y^A, e), (7.37) 

is a hnite subset of M such that A U M is, by dehnition, a downward closed set and such 
that 

e{M\M)<e. (7.38) 

We present in the end of ^7.31 one practical realization of such a procedure in the particular 
case of the elliptic equation (II.hh . Our modihed algorithm is the following. 

Bulk Chasing Taylor Algorithm with e-Accuracy: Having hxed 0 < 6^ < 1, we start 
with A^ := {0} the null multi-index. Assume that A^“^ has been selected and that the 
coefficients ty have been computed for u G A^“^. We dehne 

= SPARSE(A^-\ {ty)y^A^-i,e), (7.39) 

For all u G we compute the ty and the quantities 

my := iaax{\\ty\\v ■ i> > u and u E M^~^}. (7.40) 

We dehne as the set of indices u G corresponding to the I largest my, for the 

smallest value of I such that the bulk condition 

e{S^-^) > ee{M^-^), (7.41) 

A^ := A^-^ U (7.42) 


no 


is met. We then dehne 






The algorithm is stopped when e{M^) < 26e. 


The same computation as in fl7.3ip . now using 07.411) and 07.38P together with the satu¬ 
ration property 07.30p . leads to the reduction inequality 

a(A^) < + Be, (7.43) 

Therefore we are ensured that after sufficiently many steps k, we have 

e(M^) < e{M^) < ct(A^) < 208, (7.44) 

and thus the algorithm terminates. The hnal step may occur before (t(A^) < 26e, but we 
are still ensured by the saturation property 07.3Up that 

a{A'^) <6e{M^) <6{e{M'^)+e) <Ce, C:=6{2e + 1), (7.45) 

and thus we have reached the announced order of accuracy for the quadratic error. 

However, this does not settle the convergence analysis of the algorithm. First, we want 
to relate the accuracy cr(A^) with the number of terms #(A^) in the Taylor approximation 
through a quantitative convergence estimate. Secondly, we also want to retrieve convergence 
estimates for the error between u and its truncated Taylor approximation measured in the 
L°°{U, V) metric. This is the purpose of the next section. 


7.3 Convergence analysis of adaptive algorithms 

We know that if the Taylor series converges conditionally towards u in L°°{U, V) and if the 
sequence (||t,y||y)i.gj- belongs to the sequence space some 0 < p < 1, then there 

exists a sequence of downward closed sets (A„)„>i such that #(A„) = n and such that the 
uniform error bound 


sup 

y€U 


u{y) - 


< Cn-% 

V 


s 



(7.46) 


holds for all n > 1. This holds in particular if the assumptions of Theorem 12.81 hold and if 
in addition (||'0illv)j>i G In this section, we show that this benchmark rate is met by 

the two previously described bulk chasing Taylor algorithms, provided that the saturation 
assumption hold. We begin with a result that describes the rate of decay of the quadratic 
error (j(A„). 


Theorem 7.3 Consider a parametric problem of the form fl7.1l) such that Assumption 
AL holds for a suitable affine representer Assume that (||ti/||v)i/gj- belongs to the 

sequence space for some 0 < p < 1, and that the saturation property holds. Then for 

the bulk chasing Taylor algorithm, the convergence estimate 

a{Arf) <C\\{\\C\\v)u&A%n~^''. r-=--L (7-47) 

"* p z 
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holds for all n G {n{k) : k > 1} where we have used the convention fl7.23p . The constant C 
depends on p, 9 and 6. The same estimate holds for the bulk chasing Taylor algorithm with 
e-accuracy for all n G {n{k) : 1 < k < k{e)}, where k{e) is the step where the algorithm 
terminates. 


Proof: We begin by considering the bnlk chasing Taylor algorithm. We hrst control the 
cardinality of the set which is added to for any k > 1. Recall that this set is 

obtained by picking indices v G corresponding to the I largest dehned by (I7.34p . 

for the smallest valne of I such that the bulk condition fl7.29p is met. Let C denote 
the set corresponding to the I — 1 largest for this value of 1. Since the bulk condition is 
not met by we have 

(1 - ^ (7.48) 

On the one hand, using Stechkin’s Lemma 13.61 we hnd that 

\\tu\\v< Y (7-49) 

and therefore, using the fact that is dominated by the monotone majorant of ||tj/|lv, and 
that I = we obtain 


5^ ll*.|lt < ll(ll*.|lv)..^ll?^(#(S‘-‘))-=’'. 

Combining this with (17.481) we hnd that 

(l-0)e(M^-^) < ||(||t.|k).G^||,^^(#(^^-^))-^L 
Using the saturation property, it follows that 

or equivalently 

For any fc > 1, we may thus control the cardinality of by writing 


#(A") 


k-l 

1=1 



( 7 . 50 ) 

( 7 . 51 ) 

( 7 . 52 ) 

( 7 . 53 ) 

( 7 . 54 ) 


On the other hand, we know from fl7.3ip that cr(A^) > ^a{A^) with k := 

therefore 


#(A") < 1 + 


6 \ V2r KV2r 

l-e) 1 - KV2r 




l/2r 


1 


|, and 
(7.55) 
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This can be rewritten as 


S i!^(i_ri/2.02r ll(ll*"ll>'Wllk(*(A'^) - I)-"- (7.56) 

Using the inequality #(A^) — 1 > |#(A^) in the case where /c > 1, we have thus established 
(I7.47P with constant C := >1- If fc = 1, we simply write 



(7.57) 


which shows that (I7.47p also holds in this case. 

We next consider the bulk chasing Taylor algorithm with e-accuracy and explain how 
fl7.47p can be established for n = n{k) with 1 < fc < k{e) up to an inflation in the constant 
C* by a similar argument. First, with the exact same reasoning which led to fl7.5ip . we obtain 
the estimate 

(1 - e)e{M^-^) < ||(||t.|k).e^||,\(#(^^-^))-^U (7.58) 

We next observe that since > 26e for k < k{e), we have the modihed saturation 

property 

a(A^-i) < 5e{M^-^) < 5(e(M^-^) + e) < 5e(M^-^), (7.59) 

with 6 := (5(1 -I- ^). By the same reasoning, for any n > 1, we obtain the bound 


k—1 ^ 

#(A^) = 1 + 5^#(^^)<1+(— 
1=1 


S \ l/2r- 


u IvJi^eJF 


k-l 




(7.60) 


1=1 


The saturation property implies that cr(A^) > k} V(A^) with k := 1 — Therefore, by the 
same reasoning, we reach (I7.47p with the larger constant C := >1- 

Our next result shows that the benchmark rate fl7.46l) is met under the same assumptions 
as those of the above theorem. 


Theorem 7.4 Consider a parametric problem of the form fl7.ip such that the assumptions of 
Theorem \7.3\ hold and such that in addition the Taylor series converges conditionally towards 
u in L°°{U, V). Then, we have for all n > 1 the uniform convergence estimate 

< C\\{\\ty\\v)u(iA\pm'^~''1 ^ - 1 - (’^• 61 ) 

holds for alln G {n{k) : /c > 1}. The constant C depends onp, 6 and 6. The same estimate 
holds for the bulk chasing Taylor algorithm with e-accuracy for all n G {n{k) : 1 < k < 
k{e)}, where k{e) is the step where the algorithm terminates. 


sup 
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Proof: It suffices to prove that (17.471) implies (17.611) for the same value of n, up to a change 
in the constant C. Since the Taylor series converges conditionally and since (||ty|ly)yej- 
belongs to this series also converges unconditionally. We thus have 


sup 

yen 


u{y) - 




(7.62) 


For n = n(k) = #(A„) = #(A*^), we consider the set A* of the indices corresponding to the 
n largest ||tj/||y. Using Lemma [T6l Cauchy-Schwarz inequality and (I7.47p . we write 


^ IKWv + ^ \K\\v 

< \\i\K\\v)ueT\\£p{n + I)""* + A^'^eiK \ 

< \\{\K\\v)ueT\\£p{n + l)~" + 

^ ll(ll^i^lln)!^eJ'l|£p(^ + 1) ^ + A^^C^^'^\\{\\ty\\v)ueT\\£P,'i^ 

< {i+c^^Y\\i\m\v)uGA\£?r.n~Y 

which conhrms (I7.61F □ 


The above theorem assumes that the saturation property is valid. We next prove that this 
property always holds in the particular case of the elliptic problem (11.51) . Recall that we then 
have V = Hq{D) and X = L°^{D). The saturation property turns out to be a consequence 
of the uniform ellipticity assumption UEA(r). In order to see this, we introduce the norm 

11^11,:= (7.63) 

D 

which is equivalent to the U-norm under UEA(r), since we then have 

AYWv < a^in\\v\\Y < ll^lll < Omaxlblly, (7-64) 

where amin ■= niffia-g/) a(x) and Omax := niaxa,g£) a(x) = ||a||x- For u E X and j > 1, we use 
the notation 

d. := llt.lll, (7.65) 

and 

du,j ■= J \i>j\\XU\‘^. (7.66) 

D 

The proof of the saturation property uses the following lemma which relates the above 
quantities. 

Lemma 7.5 Consider a parametric problem of the type (II.5p . with affine representer {'il£j)j>i 
such that UEA(r) holds for some r > 0. Then, we have 

V] dy,j < 'yd,,, 7 := 1 - < 1, (7.67) 

“ flmax 
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and 


dy ^ Oi ^ ^ di/—ejjy 
jGsupp(i^) 


®max 

a := -^- < 1, 

T -|- Clmax 


where ej is the Kroenecker sequence with 1 at position j. 

Proof: The uniform ellipticity assumption implies that, for all x & D, 

\ijj{x)\ < Hx) - r < 7 a(x), 
i>i 


which implies (I7.67p . On the other hand, we take v = t^y in fl7.18p . which gives 

du ^ ^ J" 

igsupp(i/) ^ 

and therefore 

dj/ < - J li^ji ivti/_eji + — J li^ji ivti/i. 

iGsupp(i^) ^ jesupp(y) ^ 

Using fl7.69p in the second term of (17.711) gives 




iesupp(t/) ^ 


from which we derive (17.681) . 


(7.68) 


(7.69) 


(7.70) 


(7.71) 


(7.72) 


□ 


We are now in position to establish the saturation property for the elliptic problem (II.5p 
under the uniform ellipticity assumption. For any downward closed set A and any set S, we 
introduce the modihed quadratic error and energy 

a(A) := e(S') := 'Yjdu- (7.73) 

Theorem 7.6 Consider a parametric problem of the type (II.5p such that UEA(r) holds for 
some r > 0. Then the saturation property (I7.30p holds with S depending on r and Umax- 

Proof: We consider an arbitrary downward closed set A and its margin M := M(A). We 
hrst observe that 

a(A) = e(M)+a(A), A := A U M. (7.74) 

Using (I7.68p . we may write 

<Y^’^-^Y{ Y d^_ej,j^ < ^ +B, (7.75) 

i/^A iGsupp(i/) 
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where 


and 


A: ^ diy-ejj 

i?:=a^( ^ du-ej,j 

u^A js.t.u—ej£A 


~ ^ ^^>3 I ’ 

u^A j s.t. u-\-ej^A 


js.t.u+Cj^A 


(7.76) 


(7.77) 


In this splitting, we have used the fact that if ^ A and Uj ^ 0, we have either u — ej ^ A 
or z/ — Cj G M. Using fl7.67p . we control the first term A by 


A < ay di, = aja^A), 
v^K 

and by the same argument we obtain 

B < a'ye{M). 

Combining these estimates with fl7.75p . it follows that 

(1 — a'y)a{A) < a'je^M), 

and thus by fl7.74p 


(7.78) 


a{A) < (l + -^)e(Af). 

V 1 — ay/ 


(7.79) 


(7.80) 


(7.81) 


Finally, using the norm equivalence (I7.64p . we obtain the saturation property (17.301) with 


constant 6 := (1 + 

r V 1 — 


-a 7 


□ 


We conclude this section by presenting a concrete realization of the procedure SPARSE 
which is used in the bulk chasing Taylor algorithm with e-accuracy, in the particular case of 
the elliptic problem fll.Sp . We again work under UEA(r). We define 

:= A, 

a 

and choose an integer J > 0 large enough such that 


j>j 


< 


ae(A) 


-1 


re. 


(7.82) 


X V1 — ay/ 

where a and y are defined as in Lemma 17.51 and we define 

Al := SPARSE(A, (7)^eA, e) := {u e M ; u - ej e A ^ j < J}. (7.83) 

Clearly M is finite with ^{M) < J#(A). 
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Theorem 7.7 With the above definition of M, one has 

e{M\M)= \Mv<^- 

u£M\M 

Proof: We proceed in a similar way to the proof of Theorem 17.61 by hrst writing 

e(M \M)<a E ( E 

ueM\M iesupp(i/) 


where now 

A:=a ^ ^ ^ =a ^ ^ ^ 

v^M\M j s.t. i/—ejGM\M u^M\M j s.t. h^+ejEAf\M 


(7.84) 


(7.85) 


(7.86) 


and 

B:=a ^ ^ ^ (7.87) 

uGM\M j s.t. i'—ej^M\M pGAuM j s.t.u-\-ejGM\M 

In this splitting, we have used the fact that if z/ e M \ M and Vj 0, we have either 

z/ — Cj e M \ M or z/ — Cj G A U M. Using fl7.67p . we can bound A by 


A < aj diy = ajelM \ M). (7.88) 

ueM\M 


To bound B, we hrst claim that for any z/ G A U M such that i' + Oj G M \ M, we must 
have z/ G A and j > J. Indeed, since z^ + Cj G M \ M, the dehnition of M guarantees that 
z/ + Cj = z/ + Cfc for some z> G A and k > J. li j = k we have our claim, li j ^ k then 
necessarily u — ej E A since A is a downward closed set, and therefore u can be written as 
the sum of z/ — G A and e^, which means that z/ is not in M. Thus, we have verihed our 
claim. From the claim, it follows that the only j’s that may contribute in the summation 
inside B are such that j > J and u — Cj E A. Hence, 



Aj 


uGA j>J 



=“E/a 


)|Vt.| 

ueA j; 

>J 


=“E/a 


Vt, 

ueA i; 

>J 



e(A)<(l 

11 A 


j>J 


Combining the bounds for A and B with fl7.85p . we obtain 


e(M \ M) < 


B 


1 — ay 


< re. 


which by fl7.64p implies fl7.84p . 


(7.89) 

□ 
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7.4 Space discretization and computational cost 

In numerical computation, we need to take into account the additional space discretization 
of the solution map in the space 14 C V. In the case of variational problems of the form 
fl7.4p . one typical such discretization is by the Petrov-Galerkin method: we dehne Uh{a) G 14 
such that 

B{uhia),Vh;a) = L{vh), Vh G 14, (7.90) 

where 14 C G is an auxiliary hnite element space such that dim(14) = dim(14). For elliptic 
problems such as fll.Sp . we have V = V and we may take 14 = 14, which is the standard 
Galerkin method expressed in fl5.3p . We make the assumption that the discrete problem is 
well posed for all a G a{U), that is, Assumption AL also holds for the discrete problem. 
Dehning Uh{y) = Uh{a{y)) for a given afhne representation, we thus have 

B{uh{y),Vh;y) = L{vh), Vh G 14, (7.91) 

and the same computation as in Lemma 17.11 shows that the Taylor coefficients ^ 14 of 
y 1 -^ Uh{y) are computed by solving 

B{t^^h,Vh) = L^{vh), Vh G 14, (7.92) 

where Ly = L when = 0 is the null multi-index and 

Ly{vh)-.= - ^ Bj{ty-ej,h,Vh), (7.93) 

iesupp(i/) 

when V ^ T — {0}. Note that these relations amount in applying the Petrov-Galerkin 
approximation in the recursive computation of the Taylor coefficients ty. 

Non-adaptive and adaptive strategies may therefore be applied in order to compute trun¬ 
cated Taylor expansions of the form 

VnAv) '■= X] (7.94) 

with a similar convergence analysis as for the continuous problem fl7.10p . In particular, if 
the assumptions of Theorem 12.81 hold for the solution map a i—)■ Uh{a) and if in addition 
(||4j||x)j>i £ ^^(‘F), both non-adaptive methods based on a priori bounds for the ||4,/i||y or 
adaptive methods based on bulk chasing have convergence rate 

\\uh - Un,h\\L^{u,v) < Cn~\ s :=- -1. (7.95) 

p 

The constant C is independent of h if in the assumptions of Theorem 12.81 the open set O 
and the bound in fl4.4p can be hxed independently of h. 

Similar to the splitting 06.1051) that was used for the interpolation method, we may split 
the resulting error into 

\\u — Un,h\\L°°{U,V) 4 \\Uh — Un,h\\L°°(Uy) + \\v — Uh\\L°°(U,V)- (7.96) 
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The second term is bounded by the error e{h) of the numerical solver. Therefore we obtain 
an global error bound of the form 

\\u — Un,h\\L°°{u,v) ^ Cn ^ + e{h), (7.97) 

similar to the bound (16.1061) obtained for the interpolation method after space discretization. 

We next turn to the estimate of the computational cost, starting with the offline cost. The 
computation of each individual stored as vectors of dimension Nh of their coordinates 
in the nodal hnite element basis of 14, requires to solve a system. The cost of solving this 
system is of order Ch where Ch is the individual cost of one application often the discrete 
solver. Indeeds it amounts in solving the a discrete problem where we invert the exact same 
stiffness matrix as for the computation of the particular instance Mft(O). Assembling this 
system requires to compute the right hand side which necessitates ||i^||o applications of the 
stiffness matrices associated to the sesquilinear forms Bj. Since Bj is associated to a partial 
differential operator, its stiffness matrices in the nodal basis is sparse and therefore each 
such application has cost smaller of order Nh- We have already observed in ^5.31 that since 
An is a downward closed set, we have < n for each u G A„. The cost of computing an 
individual ty^h is thus at most of the order 

Cosii^) ~ Cft + log(n) Nh. (7.98) 

In the non-adaptive algorithm, we compute the n values of ty^h for £ A„, and therefore the 
total offline cost is at most of order 


(Toff ~ nCh + n log(n) Nh- 


(7.99) 


In adaptive algorithms, we need to take into account the additional computation of the 
ty^h for p in the margin of A„. For the bulk chasing Taylor algorithm with e accuracy, the 
individual cost Ch + log(n) Nh is thus multiplied by n + ^{Mn) where 

#(M„) ;= SPARSE(A„, (4,4.eA„,£). (7.100) 


For the SPARSE procedure that we have proposed in the case of the elliptic problem (11.5p . 
we have #(M„) < Jn where J = J{e) is such that 


3>J 


< e. 




(7.101) 


Having assumed that (||4illx)jGN is C summable, and organizing them such that this se¬ 
quence is non-increasing, we hnd by Lemma 13.61 that J(e) < where s := 1 — 1. It 

follows that the total offline cost for this algorithm is at most of order 

Coff ~ log(n) Nh- (7.102) 


As to the online cost, since the online stage simply amounts in the combination of the 
ty^h for computing Un,h, we hnd that this cost is of the order 


Con ~ "^Nh^ 


(7.103) 


119 












similar to the sparse polynomial interpolation algorithms. 

If e is a targeted order of accuracy, and if we have the error bound fl6.106p . then one way 
to reach this accuracy is to take both Cn~^ and e{h) of the order of e. With h{e) the inverse 
function of e{h), as in fl6.112p . we thus hnd that the non-adaptive Taylor algorithm reaches 
the order of accuracy e at cost at most of order 

Cosie) ~ ^og{e)\Nh{^) and Con(£) ~ (7.104) 

For the bulk chasing Taylor algorithm with e-accuracy, we have the more pessimistic estimate 

Coff(e) ~ log(e)|W(£) and Conie) ^ (7.105) 

due to the inflation by J{e). 

Similar to the interpolation algorithm discussed in §6, both algorithms are immune to the 
curse of dimensionality since these trade-off between accuracy and complexity are obtained 
with inhnitely many variables. 


8 Reduced basis methods 

We turn next to the class of numerical techniques for solving parametric PDFs known as 
reduced basis methods. These method aim at hnding a good subspace W C V, of small 
dimension n, to be used for approximating the elements of the solution manifold A4. We 
know that, for any hxed value of n, the best choice of W is one which gives achieves the 
inhmum in the dehnition fll.dip of the Kolmogorov n-width with /C = A4, however such 
a space, if it exists, is generally out of reach from a computational point of view. The 
reduced basis method uses a space W, which may be suboptimal, spanned by n snapshots 
u{a^),..., u{a'^) from the solution manifold Ai. While these snapshots can be chosen in 
various ways, a particularly interesting strategy proceeds with a recursive greedy selection. 
We present this strategy in ^8.11 In ^8.21 we prove, in the where K is a Hilbert space, that, 
in a certain sense, the resulting spaces W perform almost as well as the optimal n-width 
spaces in terms of convergence rates. A similar analysis is given in ^8.31 in the case of a 
general Banach space. The effect of space discretization on the convergence of the algorithm 
is discussed in ^8.41 and computational cost is analyzed in ^8.51 

8.1 Greedy selection algorithms 

The solution manifold At is a compact set in the Banach space V. While in most applications 
K is a Hilbert space, we describe the greedy algorithm for any compact set /C in any Banach 
space V. We then analyze its performance, hrst in the case K is a Hilbert space, and then 
later in the case of a general Banach space. We describe two versions of a greedy algorithm 
for generating approximation spaces for JC. The hrst version, called the pure greedy algorithm 
is rather ideal, while the second version, called the weak greedy algorithm is more amenable 
to numerical implementation. 
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Pure Greedy Algorithm: We first choose a function go G 1C such that 


ll^olk = niax||^||y. 
geK. 


( 8 . 1 ) 


Since /C is compact, such a go always exists but of course may not be unique. Assuming 
{go, ..., 5'n-i} have been selected, we set 14 := spanl^fQ, • • •, 5'n-i} and we then take gn ^ 1C 
such that 

( 8 . 2 ) 


where 


dist( 5 („, Vn)v = maxdist( 5 f, 14 )y, 

geK. 


dist( 5 f, Vn)v ■= niin 14 - h\\v. 


h€V„ 


(8.3) 


We define Uo := ao{lC)v = niax |4||y and 

g&K 


an ■■= CFn{lC)v ■= sup iuf \\g - v\\v, n>l, (8.4) 

gfZfC V&Vn 

so that 

an := dist(/C, 14)y = dist( 5 f„, 14)v- (8.5) 

This greedy algorithm was introduced, for the case when 1/ is a Hilbert space in [89] and 
subsequently extensively studied in P [67] l68]. 


In the setting of parametric PDEs, it not possible compute for a given g ^ 1C the distance 
dist( 5 ', 14)y, so that one cannot exactly perform the maximization in fl8.2p . However, it is 
possible to introduce a computable error indicator d{g, 14 )v which satisfies 

cd{g, Vn)v < dist(fif, 14)y < Cd{g, Vn)v, S' e /C, ( 8 . 6 ) 

for fixed constants c, G > 0 . Performing the maximization fl 8 . 2 p on d{g,Vn)v is equivalent 
to the application, with 7 := ^, of the following weaker form of the greedy algorithm which 
matches better its application. 

Weak Greedy Algorithm: We £x a constant 0 < 7 < 1. At the first step of the algorithm, 
one chooses a function go E 1C such that 

l4o||y > 7max|4|4. (8.7) 

ge/c 

At the general step, if gfQj • • • > 9n-i have been chosen, we set I 4 := spanl^fg,..., fi'n-i}, and 
we now choose ( 7 ^ G /C such that 

dist{gn, Vn)v > 7 maxdist( 5 (, 14 )y, (8.8) 

96 /c 

to be the next element in the greedy selection. As in the pure greedy case, we introduce 

an ■■= (JniKCjv := dist(/C, 14)y, n > 0, (8.9) 


121 






which now measures the performance of the weak greedy algorithm. Note that if 7 = 1, then 
the weak greedy algorithm reduces to the pure greedy algorithm that we have introduced 
above. With the same definition as above for := cr„(/C)\/, we thus have 

dist(5fn, Vn)v > ICTn- (8.10) 

For both of these algorithms, the sequence (cr„)„>o is monotone non-increasing. It is 
also important to note that neither the pure greedy algorithm or the weak greedy algorithm 
give a unique sequence (gn)n>o, nor is the sequence (crn)n>o unique. In all that follows, the 
notation reflects any sequences which can arise in the implementation of the weak greedy 
selection for the fixed value of 7 . 

8.2 Convergence analysis of greedy algorithms in a Hilbert space 

We are interested in how well the space W, generated by the weak greedy algorithm, ap¬ 
proximates the elements of /C. For this purpose we would like to compare its performance 
measured by cr„ with the best possible performance which is given by the Kolmogorov width 

dn ■= dn{}C)v- (8-11) 

If (c’'n)n>o were bounded by {dn)n>o up to a fixed multiplicative constant, this would mean 
that the greedy selection provides essentially the best possible accuracy attainable by n- 
dimensional subspaces. However, such a general comparison is not to be expected. 

Various comparisons between an and dn have been proven in the literature. A first result 
in this direction, in the case of the pure greedy algorithm applied to a Hilbert space V, was 
given in where it was proved that 

(TnilC)v < Cn2^dn{K:)v, n > 1, (8.12) 

with C an absolute constant. The same result holds with C depending on 7 for the weak 
greedy algorithm. While this is an interesting comparison, it is only useful if dn{K,)v decays 
to zero faster than n~^2~^ which may be a severe assumption. Unfortunately, the above 
result is sharp in the following sense: it was proved in | 6 ] that for all n > 1 and e > 0 there 
exists a compact set /C such that 

an(/C)y > (1 - £)2"d„(/C)y. (8.13) 

This reveals that a direct comparison between cr„(/C)y and dniJOjv is doomed to fail. 

Significant improvements on fl 8 . 12 l) were given in [ 6 ], again in the Hilbert space setting, 
by changing the way of comparing cr„(/C)y and dn{JC)v- Perhaps the most interesting com¬ 
parison is the following: if for some constant U > 0 and some s > 0 , the compact set K. 
satisfies dn{JC)v < C(max{l, n})”® for all n > 0, then there is a constant C which depends 
only on C and s such that 

o'n(V)y < U(max{l, 77 ,})“^, n > 0. (8-14) 
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In other words, for the scale of polynomial decay, the greedy algorithm performs with the 
same decay rates as n-widths. These resnlts were improved npon in [32] and extended to 
the case of a general Banach space V. 

The analysis of the two greedy algorithms above is qnite simple and executed with ele¬ 
mentary results from linear algebra. We consider the case when 1/ is a Hilbert space and 
show that the action of the weak greedy algorithm is captured by a certain lower triangular 
matrix. Note that in general, the weak greedy algorithm does not terminate and we obtain 
an inhnite sequence (5'n)n>o- In order to have a consistent notation in what follows, we dehne 
gn := 0, n > m, if the algorithm terminates at n = m, i.e. if amiJOjv = 0 . 

By {g*n) „>o we denote the orthonormal system obtained from {gn)n>o by Gram-Schmidt 
orthogonalization executed in the natural order. It follows that the orthogonal projector 
from V onto W is given by 

n—1 

Png = '^{g,gdgh 

i=0 

where (•, •) denotes the inner product of V, and, in particular, 

n 

gn Pn+ign ^ ^ ^njg-j i ^n,j {gnigj)i 0 < J < 77.. (8.15) 

j=0 

We consider the inhnite lower triangular matrix 

This matrix incorporates all the information about the weak greedy algorithm on /C. For 
example, the n-th row of A gives the n-th element gn in the greedy selection. The following 
two properties characterize any lower triangular matrix A generated by the weak greedy 
algorithm with constant 7 . With the notation an '■= cr„(/C)v^, we have: 

PI: The diagonal elements of A satisfy < cr„. 

P2: For every m>n, one has — ^n- 

Indeed, PI follows from 


®n,n \\gn .Pnfi'n||y dist(5'n) W)v- 

This shows the upper bound in PI because each element of /C is approximated to error cr„. 
It also shows the lower bound because of the weak greedy selection property fIS.Sp . To see 
P2, we note that for m> n, 

m 

= \\gm - PngmWv < ^aX ||^ - = a^. 

^ g£K. 

3=n 
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Remark 8.1 If A is any infinite matrix satisfyingY*! andP2 with (ct„)„>o a non-increasing 
seguence that converges to 0, then the rows of A form a compact subset of £^(No) where 
No := N U {0}. If K, is the set consisting of these rows, then one of the possible realizations 
of the weak greedy algorithm on this set K with constant 7 will choose the rows in that order 
and A will be the resulting matrix. In this sense, the action of the greedy algorithm on the 
original set K. is completely described by the matrix A. 


It follows from the above remark that there is no loss of generality in assuming that the 
inhnite dimensional Hilbert space V is ^^(Nq) and that g* = ej, where ej is the vector with 
a one in the coordinate indexed by j and is zero in all other coordinates, i.e. {efii = 6j^i. 

With this matrix description of the weak greedy algorithm in hand, estimates for the 
convergence rate of the algorithm rely on an analysis of H or a corresponding matrix when 
V is not necessarily Hilbertian. Notice that the diagonal elements of A give the errors cr„ 
and hence we want a general way to bound the diagonal elements of matrices A with the 
above properties. The following lemma from [32] gives a general way to bound diagonal 
elements of a general lower triangular matrix G. It is applied later to the sections of A to 
obtain convergence results for the weak greedy algorithm. 


Lemma 8.2 Let G = {gij) be a K x K lower triangular matrix with rows gi,... ,^k- If 
W is any m dimensional subspace of for some 0 < m < K, and P is the orthogonal 
projection from onto W, then 


K 


det(G)2 = Yl 


9h< 


2 = 1 


(- 

Vm 


K 


K 


\\PSi 


2=1 


K 


m 




2=1 


K—m 


(8.16) 


where || ■ 11^2 is the euclidean norm of a vector in 


Proof: Let ipi ,..., (p^ be any orthonormal basis for the space W and complete it into an 
orthonormal basis cpi,, px for If we denote by <I> the K x K orthogonal matrix whose 
j-th column is pj, then the matrix G := G<I> has entries Cij = (gj, pj). We denote by Cj, the 
j-th column of G. It follows from the arithmetic geometric mean inequality for the numbers 


that 


n 

i=i 


|Cjll^2 < 


ffl 

-E 

m ^^ 

i=i 




m K 


K 


—Y = [—Y 

m / \m 


j=l i=l 


2=1 


(8.17) 


Similarly, 


K 

n 

J =171+1 


- L ^ 


K 


m 


K 

E 

j=m-\-l 


Ir-112 


K—m 


K 


K 


m 


^||gi -Pgil 


£2 


1=1 


K—m 


(8.18) 
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where we have used the fact that ipj is orthogonal to W when j > m. Now, we invoke 
Hadamard’s inequality for the matrix C, which says that 

K 

(detC')2 < JJ||cj|1^2, (8.19) 

and combine it with relations fl8.17p and fl8.18p to obtain 

( 1 ___ \ ^ / 1 _> \ K—m 

■ ( 8 . 20 ) 

2=1 2=1 

The latter inequality and the fact that | detC| = | det G\ gives fl8.16p . □ 


Let us now see how this lemma is utilized to derive convergence results for the greedy 
algorithm. We continue to restrict ourselves to the case of a Hilbert space and the weak 
greedy algorithm with constant 7 . Later, we indicate how the results change when H is a 
general Banach space. The following theorem, taken from [32], relates the errors (T„(/C)y to 
the n-widths dn{)C)v- 


Theorem 8.3 For the weak greedy algorithm with constant 'y in a Hilbert space V and for 
any compact set fC, the following inegualities between an ■= a„(/C)y and dn ■= dniJOjy hold 
for any N > 0, K > 1, and 1 < m < K, 


K 




< 7 --(^)’ 


K 

K — m 


K—m 


a 


2m 

Af+1 


j2K—2m 


( 8 . 21 ) 


Proof: In what follows, we assume that there exists a space Wm which achieves the infimum 
in the definition of the m-width of /C, that is, such that 


max min \\q — w\\v = dm- 

g&K. weWm 

If such a space does not exist, we may, for each e > 0, find one such that 


max min llgf — tally < dm + £ 

g&K weWm 

and modify the proof below by a limiting argument so as to reach the same conclusion. 

We consider the K x K matrix G = {gij) which is formed by the rows and columns of A 
with indices from {7V +1 ,... ,N + K}. Each row g* is the restriction of row A^ + t of H to the 
coordinates + 1,..., iV + JL. The space Wm determines a sequence space Wm C such 
that dist(gj, Wm)e^ < dm ior i = 1,... K. Let W be the linear space which is the restriction 
of Wm to the coordinates A^ + 1,..., A^ + iL. Obviously, we have dim(iy) < m. Let W be 
an m dimensional space, W C span{eAr+i,..., CAr+i^}, such that W (Z W and let P and P 
be the projections in onto W and W, respectively. Clearly, 

WPgiWi^ <\\Si\\£^ < (^N+i, i = l,...,K, ( 8 . 23 ) 
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where we have used Property P2 in the last inequality. Note that 


gi - PgiWfi < I|gi - PgiWfi = dist(gi, W)i2 < dist(gi, W)i2 < dm, i = l,...,K. (8.24) 


It follows from Property PI that 


K 


K 



(8.25) 


We now apply Lemma for this G and W, and use estimates fl8.23l) . fl8.24p . and fl8.25p to 


derive fICTD . 


□ 


Using this theorem, we now establish convergence results for the weak greedy algorithm, 
showing in particular that if dn{JC)v decays with an algebraic or exponential convergence 
rates, then a similar rate holds for an{JC)v- 

Corollary 8.4 For the weak greedy algorithm with constant 7 m a Hilbert space V, we have 
the following: 

(i) For any compact set K., we have 


a'n(/C)y < \/27 ^do{K){} 


min dm" {PPjv, n>l. 


(8.26) 



d„(/C)y < n > 0 ^ cT„(/C)y < Cie-^^^\ n > 0, (8.28) 


where ci = yS and Ci := Comax {\/27 ^,e^^}. 

Proof: (i) We take N = 0, K = n and any 1 < m < n in Theorem 18.31 Using the 
monotonicity of (ct„)„>o and the fact that Ci < ao < do, we obtain 


n 



(8.29) 


Since x "^(1 — xY ^ < 2 for 0 < x < 1, we derive (I8.26p . 
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(ii) It follows from the monotonicity of (ct„)„>o and fl8.2ip for N = K = n and any 1 < m < n 
that 


2n 


3=n+l 


—2n 


n 


n 


■.m/ \n — rri/ 

In the case n = 2k and m = k we have for any positive integer fc, 

(T4fc < \/27"V<^2fc4- 


2m j2n—2m 


(8.30) 


Assuming that dn{)C)v < C'o(niax{l, n}) ^ for all n > 0, we obtain by induction that for all 
j > 0 and n = 2\ 

(Tn = < C2-^^ < n-% C := 23^+17-2(70. (8.31) 

Indeed, the above obviously holds for j = 0 or 1 since for these values, we have 

(72, <ao = do<Co< (72-*h (8.32) 


Assuming its validity for some j > 1, we find that 

(J2J+1 < ^/2'y \/(y2i ^2^-1 

< 7-i2^V2C'C'o2-*(^+i) 

= \/C7/23*+1(7o7-22-"(^+i) = (72-"(^+i\ 

where we have used the definition of C. For values 2^ < n < 2-^+i, we obtain the general 


result by writing 

(Tn < ct2, < C2-^^ < 2^Cn-^ = (7in-7 (8.33) 

In the case n = 0, we simply write (Tq = do < Cq < Ci. 

(iii) Assuming that dn{}C)v ^ Coe~‘^°"'" for all u > 0, we obtain from (i) for all u > 1, 

(^2n+l < (^2n < V2'J~^ dndo < ^/2C^'J~^< V2Co'J~^e~ 72n+l)7 (8.34) 

This proves 

cTn < n>2. (8.35) 

For the values u = 0 or n = 1, we simply write 

(Tn<(To = do<Co< (8.36) 

which concludes the proof of (iii). □ 


Remark 8.5 Inspection of the above proof shows that in items (ii) and (iii) of Corollary 
8-41 if the same decay rates of dn{lC)y are only assumed within a limited range Q < n < N, 
then the same decay rates of an{]C)v are achieved for the same rate 0 < n < N up to some 
changes in the expressions of the constants Ci, Ci and Ci. 
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8.3 Convergence analysis of greedy algorithms in a Banach space 

We now turn our attention to the performance of the weak greedy algorithm for a compact 
set /C in a general Banach space V. We use the abbreviation an := an{JC)v and dn := dn{)C)v- 
While the development is quite similar to the case of a Hilbert space, there is a slight loss 
in the comparison between an and dn due to the lack of Hilbert space orthogonality. 

As in the Hilbert space case, we associate with the greedy procedure a lower triangular 
matrix A = ia the following way. For each j = 0,1,..., we let Xj G V* be the 

linear functional of norm ||Aj|ly. = 1 that satishes 

(i) = 0, g&Vj, and {ii) Xj{gj) = dist{gj,Vj)v. (8.37) 

The existence of such a functional is a simple consequence of the Hahn-Banach theorem. We 
now let A be the matrix with entries 


— ^j{gi)- 

From (ii) of fl8.37p . we see that A is lower triangular. Its diagonal elements ajj satisfy the 
inequality 

■yaj < Qjj = dist{gj, Vj)v = ctj, (8.38) 

because of the weak greedy selection property fl8.8p . Also, each entry aij satishes 

l«ml = l^jigdl = \^j{gi-g)\ < l|Ajllv*||^i -^||v = ||^i -^||v, j < h 

for every g ^Vj, since XjiVj) = 0. Therefore, we have 

|ai,j| < dist(5fi, Vj)v < Cj, j < i. (8.39) 

Theorem 8.6 For the weak greedy algorithm with constant '-j in a Banach space V and for 
any compact set /C contained in V, we have the following inequalities between an ■= aniJOjv 
and dn ■= dnifOjv: for any N >d, K >1, and 1 < m < K , 

K / K 

n <"»+. ^ 2‘<Y. “"Ti (8.40) 

i=l \i=l j 

Proof: As in the proof of Theorem 18.31 we consider the K x K matrix G which is formed by 
the rows and columns of A with indices from {N + 1,..., N + K}. Let Vm be a Kolmogorov 
subspace of V for which dist(/C, Vm)v = dm- Again, we assume that such a space Vm exists. 
Otherwise we modify the proof given below by adding an arbitrary e > 0 to dm and then 
letting e tend to zero at the end. 

For each i, there is an element hi G Vm such that 

Wdi dist(5'j, Vm'jv — dmy 
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and therefore 


- hi)\ < ||Aj|ly*|| 5 (i - hi\\v < dm- (8.41) 


We now consider the collection of vectors (AAr+i(h),..., A 7 v+x(h)) for all h G Vm- They 
span a space Wm C of dimension at most m. We assnme that dim(Wm) = m (a slight 
notational adjustment has to be made if dim(lTm) < m without affecting the hnal result). 
It follows from fl8.4ip that each row gj of G can be approximated by a vector from Wm in 
the norm to accuracy dm, and therefore in the norm to accuracy y/Kdm- Let P be the 
orthogonal projection of onto W. Hence, we have 


IlSi 'L*gi|l£2 — Kdm, f !)•••) d^- 

It also follows from (I8.39p that 


(8.42) 


1/2 


and therefore 


WPgiWi, < \\gi\\i, < ( ^N+j 

0=1 


^N+j < ^N+i- 

i=l j=l 


(8.43) 


(8.44) 


2=1 


2 = 1 


Next, we apply Lemma [8]2] for this G and W and use estimates fl8.38p . fl8.42p and fl8.44p to 
derive 



/ K \ ™ 


and the proof is complete. 


□ 


In analogy with Corollary 18.41 we can use the above result to establish convergence 
theorem for the weak greedy algorithm in a general Banach space. Since the proof is very 
similar to that of Corollary 18.41 except that we use fl8.40p in place of fl 8 . 2 ip . we only state 
the result and refer to [32] for more details. 

Corollary 8.7 Suppose that V is a Banach space. For the weak greedy algorithm with a 
constant 7 , applied to a compact set 1C <ZV, we have the following: 
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(i) For any n> 1, we have 


, j — 1 n — m 

o'ni^jv < v 27 min n 2 ™ 

l<m<n 






V 


(8.45) 


, j=i 


In particular cr 2 n(/C)y < 27 ^^yndo{JC)vdn{^C)v for all n > 1. 

(ii) For any s > 0, Cq > 0 and e > t), we have 

dn{l^)v ^ C'o(Kiax{l,n > 0 an{Kf)v < Ci(max{l,n > 0, (8.46) 


where Ci depends on Cq, s, 7 and e. 

(iii) For any s > 0 and Cq, Cq > 0, we have 

cln{JC)v < n > 0 ^ cr„(/C)y < n > 0, (8.47) 

where Ci depends on s and Cq, and where Ci depends on Cq, 7 , s, and Cq. 

The statement (ii) in the above corollary shows that there is a loss of | in the algebraic 
rate of decay of an compared to that of dn- It is natural to ask whether this loss is unavoidable 
when proving results in a Banach space. We next provide an example which shows that a 
loss of this type is in general unavoidable. However, there is still a small gap between the 
above corollary and what the example below provide. 

Let us begin by considering the space V = £°°(MU{0}) equipped with its usual norm. We 
consider a monotone non-increasing sequence xq > > X 2 > ■ • • of positive real numbers 

which converges to zero and we dehne 


fj . XjCj, j 0 , 1 ,..., 


(8.48) 


where cj is the Kroenecker sequence with 1 at position j. We consider the compact set 




(8.49) 


From the monotonicity of the x/s, the greedy algorithm for /C in X can choose the elements 
from /C in the natural order /q, /i,.... Hence, 


aj = ajiJOjv = Xj, j > 0. (8.50) 

We want to give an upper bound for the Kolmogorov width of /C. For this, we shall use 
the following result (see (7.2) of Chapter 14 in [M]) on Kolmogorov n-widths of the m- 
dimensional unit ball 6 ™ of C in the metric, in M™: 

dn(^]")v < C'o^log 2 (m/n)j , 1 < n < m/2. (8.51) 
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Let us now define the sequence {xj}j>o so that in position 2^ ^ < j < 2^ — 1 it has the 
constant value 2~^^, for all k > 0, where s > 1/2. It follows that 

(Xn{^)v > cn“^, n > 1, (8.52) 

for some c > 0. We now bound the n-width of /C when n = 2 ^+^ by constructing a good 
space Vn of dimension at most n for approximating /C. The space W is dehned as the span 
of a set E of at most n vectors which we construct as follows. First, we place into E all of 
the vectors, cq, ei,..., e 2 fc_i. Next, for each j = 0,1,... n, we use (18.511) to choose a basis for 
the space of dimension 2 ”'“-^ whose vectors are supported on [ 2 ^+-^, — 1 ] and this space 

approximates in V each of the j] for i = ..., — 1, to accuracy 

where we used the fact that s > 1/2. We place these basis vectors into E so that 

#(E) < 2^ + 2^+^ - 1 < n. (8.53) 

Notice that lx/ < 2“^^^ for i > 2^^. This means that for the space W := span(i?), with 
n = 2 ^"+^, 

dn{JC)v < dist(/C, Vn)v < max 12-2^^ max C'o2-(^+^>2-('=-^i/2^1 

( l<j<n J 

= max Co 2 -^(*+^/ 2 ) . max^ 2 -^(*-^/ 2 )^| < ^^^-(.+i/ 2 )_ 

From the monotonicity of {dn{JC)v)n>o, we obtain that 

dn{}C)v < n > 1. (8.54) 

This example shows that the loss of | which appears in (ii) of Corollary 18.71 can in general 
not be avoided. 

8.4 Space discretization and convergence analysis 

The greedy algorithms introduced in the previous section are at this stage only theoretical 
algorithms because they involve several steps that cannot be implemented numerically. To 
describe a numerical version of these algorithms that are applicable to solving parametric 
PDFs, we place ourselves in the following numerical setting. We assume that we are given 
a target accuracy e > 0 and we wish to hnd a space Vn = spanj^fi,..., gn} where n = n{e) 
such that 

dist(A4,14)v := max dist(n, 14)v = sup dist(M(a), V/)v < (8.55) 

aeA 

and of course we want n to be small. In the reduced basis method, the functions gi are 
picked from the solution manifold A4, or equivalently, are of the form 

gi = M(a*), (8.56) 
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where {a^,... ,a”} are picked from the parameter set A. Our benchmark is given by the 
n-width of M. Namely, we know that as soon as dn{Ai)v < £ then there is a space of this 
dimension n which satishes fl8.55p . We have seen that the theoretical greedy algorithms also 
give us such a space Vn with provable bounds on performance, namely with rate guarantees 
on the growth of n with respect to e comparable to the n-width, as expressed by Corollaries 
18.41 and 18.71 However, the greedy algorithm as it stands cannot be implemented numerically 
for several reasons that we now delineate. 

Issue 1: Computing the greedy selection g^: Once the parameter of the k-th greedy 
selection is identified, the function g^ := u{a^) cannot be computed exactly. In practice, it 
is computed approximately by space discretization in the finite element method in the space 14 . 

This means that we take 

Qk = Uh{a’') e 14, (8.57) 

and so the spaces 14 are subspaces of 14- As explained further, this may be viewed as 
applying the weak greedy algorithm to the approximate solution manifold defined as 

Aih ■= {uh{a) : a G A}. (8.58) 

Recall that 

dist(A4/i, Ai)v = niax ||n(a) — Uh{a)\\v < £{h). (8.59) 

aeA 

where e{h) is the accuracy of the numerical solver. In order to reach the goal fl8.55l) . we pick 
h such that e{h) < e/3. 

Issue 2: Search over the manifold Aih- The k-th greedy step requires a search over the 
entire manifold A4h to choose the next basis function g^. Since the manifold is typically an 
infinite set, this search has to be discretized. 

One way to handle this issue is by hnding a hnite set Aih,e C Aih such that each element 
in Aih is at distance at most e/3 from A4h,£, i.e. 

supdist(n(a), A4h,e)y < £/3. (8.60) 

In practice this discretization is done on the parameter side so that each v G is of the 
form u(a), a G Ae, where Ae is a finite subset of A. If we apply the weak greedy algorithm 
to Aih,£ until we are guaranteed that the resulting space 14 satishes dist(A4/i,£, 14 )y 4 £/3, 
then we are guaranteed that the goal 08.551) is met since 

dist(A4, 14 )y < dist(A4, Mk)v + dist(A4h, Mh,e)v + dist(A4/i,e, 14)y < (8.61) 

Issue 3: Computation of dist(Mft(a), 14)y for a G A (or a G As). At each iteration k of 
the greedy algorithm, we need to compute dist(Mft(a), 14)y to a sufficient accuracy so when 
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selecting gk based on these computed distances we are certain that the weak greedy criterion 
flS.Sp is satisfied. 

Here, we want to avoid computing Uh{a) itself since this is too costly and must be done 
many times, i.e. for each a G As- Instead, this computation is done by a surrogate d{a, Vk)v 
which is typically evaluated by a residual-based a posteriori analysis from the Galerkin 
approximation to Uh{a) from 14. This surrogate satishes 

5d{a, Vk)v < dist{uh{a), 14 )v < fid{a, 14 )v- (8.62) 

A practical construction of this surrogate is discussed further in the particular case of the 
elliptic problem fll.Sp . It follows that maximizing this surrogate in place of the true error 
amounts in applying the weak greedy algorithm with constant 7 := | to the approximate 
solution manifold Aih- 


We can now put together the proposed solutions to each of the stated numerical issues 
1, 2 and 3, and form the following numerical version of the weak greedy algorithm. 

Numerical Weak Greedy Algorithm: We assume we are given a numerical tolerance e 
and that for each subspace I 4 of 14 ? we have, in hand, a surrogate d{a, I 4 ) which satisfies 
fl8.62p with uniform constants 5,(3. We first construct a set Ae of parameters for which the 
discrete set M.h,e satisfies (I8.60p . We now run the pure greedy algorithm on the compact set 
/C := Aih,s however using the surrogate d{a, 14) in place of dist{u{a), 14)y. This means that 
the new element gn = Uh{a'^) is defined by 


a” := argmax{d(a, 14-i) : a ^ Ae}. 
We stop the algorithm at the first value n = n{e) for which 

max{d(a, Vn) : a E Ae} < ^ 


(8.63) 


(8.64) 


The output of this perturbed greedy algorithm is our reduced basis space 14- 


In view of the previous discussion, on issues 1, 2 and 3, the output space satishes the 
goal fl8.55p . As an immediate consequence of Corollary 18.41 we obtain one hrst result on its 
number of steps n{e), which uses assumptions on the n-width of Aih- 


Theorem 8.8 For the above algorithm, we have: 
(i) For any s > 0 and Cq > 0, 


dn{Mh)v < Co{meix{l,n}) n>0 


n{e) < 



£ > 0 , 


where Ci := 7 ^ 2 ^'^+^Co. 


(8.65) 
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(ii) For any s > 0 and Cq, Cq > 0, 

dniMh)v < n>0 n(£) < max | log(^^^-j, o| j , e > 0 , 

( 8 . 66 ) 

where ci = and Ci := Cq max{\/ 27 “^, 

Proof: This is a direct application of items (ii) and (iii) in Corollary 18.41 using the fact that 
dn{M.h,e)v < dn{M.h)v- 

Let us observe that the assumptions in the above theorem are on the decay of the n- 
widths of M-hi in contrast to Corollary 18.41 which uses assumptions on the decay of the 
n-widths of A4. As already explained in ^6.31 the approximate solution map Uh may often 
be viewed as the solution map of a discrete parametrized problem of the form (16.1031) with 
similar properties as the original parametric problem (11.11) . This allows us to apply the same 
techniques as in §4 in order to evaluate dn{M.h)v and justify the validity of the assumptions 
in the above corollary for relevant instances of parametric PDEs. 

In more general cases, we may be able justify the decay of dn{M.)v but not of dn{M.h)v- 
This occurs for example if the solver involves a different hnite element space for each instance, 
such as in adaptive methods. Then, we may still write 

dn{-M.h)v Fi dn{M.)v^{h). (8.67) 

This means, for example that if we start from the assumption that dn{M.)v ^ C'o(max{l, n})“^, 
we need to study the weak greedy algorithm applied to Mh,6, however under the modihed 
assumption 

dn{Mh)v < Co(max{l,n})“® + e{h). (8.68) 

We may then separate n between the ranges {1,... ,N{h)} where e{h) < Co(max{l, n})“® 
and the larger values of n. Then, having hxed e{h) = e/3 and using Remark l8.51 we reach a 
similar conclusion on the order of magnitude of n{e) as in Theorem 18.81 The same holds for 
exponential rates. 

8.5 Computational cost 

We now turn to the analysis of the computational cost required by the numerical weak 
greedy algorithm in order to reach the accuracy goal fl8.55p . For simplicity, we restrict our 
attention to the regime of algebraic rates, that is described by item (ii) in Theorem 18.81 
A similar analysis can be carried out for exponential rates. Here, we only consider linear 
elliptic problems expressed in variational form, which are particular cases of those treated 
in §7: hnd u E V such that 

B{u,v, a) = L{v), V E V. (8.69) 

where where and L are continuous sesquilinear and antilinear forms over V xV and 

V respectively, and where we make the additional assumption that 

a 1-4 R(-, •; a), (8.70) 
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is a continuous linear map X to 03 the set of continuous sesquilinear forms over V xV. We 
work under the following symmetric elliptic version of Assumption AL. 

Assumption ALE: The parameter set A has a complete affine representer and, 

for all a G a{U), the sesquilinear form Bf, qa) satisfies the coercivity conditions fl2.14p and 
it is symmetric when restricted to real valued functions ofV. 

Under such an assumption, the approximate solution Uh{a) G 14 is defined by the 
Galerkin method, that is, 

B{uh{a),Vh;a) = L{vh), Vh G 14, (8.71) 

and can be computed for any given a G a{U) by the numerical solver at cost Ch- 

We turn now to the online cost of the numerical weak greedy algorithm assuming that 
we have already computed in the offline stage the reduced basis elements pk = Uh{a^), 
/c = 0 ,...,?7 , — 1, by using the possibly expensive finite element solver for Uh- Given a query 
a E A, the online stage computes Un{a) G 14, where 14 is the reduced basis space. We recall 
that 14 C 14- We hnd Un{a) by the Galerkin method for 14, that is, 

Biunifl') 1 Vm of LivA), Vn G 14, (8.72) 

This amounts in solving an n x n linear system, where the unknowns are the coefficients 
ai{a) in the decomposition 

n—1 

Un{a) = ^ai{a)gi. (8.73) 

1=0 

Note that, as opposed to stiffness matrices resulting from the discretization of PDEs in a 
nodal hnite element basis, the resulting stiffness matrix 

B„(a) = {B{gk,gi]a))k,i=o,...,n-i- (8.74) 

is generally full. Using a direct solver, such as Gauss elimination, the cost of solving this 
system is therefore or order 

n{ef ~ (8.75) 

ffowever, we also need to take into account the cost of assembling the system, that is, 
computing the above stiffness matrix which depends on a. Since the data vector := 
{L{gk))k=o,...,n-i of this system does not depend on a, its computation can be performed 
during the offfine stage. In order to compute the stiffness matrix, we recall the bilinear 
forms B, Bj and B{-, •, y) defined in §7.11 li y E U and 

a = a{y) = a + '^yj'i/jj, (8.76) 

i>i 


the stiffness matrix is 


^niy) ^ ^ 


(8.77) 






where 


Bji . Ql)^ k^l=0,...,n—l Snd ^n,j ■ {,Bji^Qki Ql)') k,l=0,...,n—li (8.78) 

are n x n matrices. Each of these matrices can be computed in the offline stage, however 
the inhnite sum over j > 1 needs to be truncated at some prescribed level J. In the case 
where {\\'ipj\\x)j>i is P’ summable, and if the xjjj are organized such that the ||'0j||jis: are 
non-increasing with j, then we then know that the L°°{U, V) error in the approximation of 
the solution map y i—)■ Uh{y) resulting from this truncation is of the order 0{J~^) where 
s := i — 1, and therefore the order of accuracy e can be preserved by taking 

J = J{£) ^ e-Vy (8.79) 


We may thus incorporate such a truncation in the definition of the approximation map 
y —)■ Uh{y) used to handle Issue 1. Note that the choice of J depends only on e and is 
independent of h. Therefore, using this Uh, the conclusion of (i) in Theorem 18.81 is retained 
and the cost of assembling the system is 

J{e)n{ey ~ (8.80) 


Note that, once the coefflcents ak{a) are found, computing the hnite element representation 
of the solution Un{a) = X]fc=o has a cost of n{e)Nh. In conclusion, the total online 

cost is of the order 

Conis)-s-^^^ + nis)Nh. (8.81) 

However, note that in some applications, one may only work with the reduced basis repre¬ 
sentation (afc(a))fc=o,...,n-i, without the need to recompute the finite element representation. 
This the case for instance when manipulating a quantity of interest such as a linear scalar 
functional 

77—1 

Q{un{a)) = '^ak{a)Q{gk). (8.82) 

k=0 

Having pre-computed the quantities Q{gk) in the offline stage, the online evaluation of this 
quantity is therefore executed at cost of order e~^P. 

Also note that Un{ci) is not the best approximation of u{a) from W in the norm V since 
it is the Galerkin projection onto Vn, however, from Cea’s lemma one has 


|M(a) — Un{a)\\v < — niin ||M(a) 

a veVn 


v\\v = 


-dist(AI, I4)v, 


(8.83) 


where a is the constant in fl2.14p and R := maxa^j, sn)!!®- This guarantees that we 
reach an error of the prescribed order e between u{a) and its reduced basis approximation 
^n(n) • 


We next discuss the offline cost. The first step is to find an e/3 net Aih,e for Aih- We 
describe this net only through the parameter set A, namely as 


A^h,e '^hi^A^'), 


(8.84) 
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where Ac is a finite subset of A such that 

supdist(a,^£)x < (8.85) 

aGA 

and C is a Lipschitz constant for the map a e-)■ Uh{a). The same type of computation as 
done in fl2.23p for the particular problem fll.5p shows that 

C = ( 8 . 86 ) 

a"' 

is an admissible Lipschitz constant. This implies that the resulting Aih,e satisfies fl8.60p . 
Note that we do not need to compute the elements of Aih,e but only the parameter values 
in Ac- 

We can bound the cardinality of Ac from results on covering numbers and n-widths. Let 
us recall that the covering number Ns := Ns{A,X) is the smallest number of X-balls of 
radius 6 that cover A. Let B{ai, 5), i = 1,... Ns, be such a covering. Note that the a* need 
not be from A but this is easily remedied. Namely, for any such ball we have B{ai, 5)n^ ^ 0, 
and so we choose an a, G B{ai,6) fl A. Then the balls B{ai,26) are a covering of A with 
centers from A. Therefore, taking r] = ■^, we can find a set Ac <Z A satisfying fl8.85p with 

HAc) < N^/2{A,X) = N^{A,X). (8.87) 

The well-known Carl’s inequality [75] gives a bound on the covering numbers Ns{A,X) in 
terms of the n-widths dn{A)x of A. In our case, this inequality gives 

Nr,{A,L'^) ( 8 . 88 ) 

where Ci is a constant depending on s. This gives us the bound 

#(A) < (8.89) 

for a constant Ci that also depends on s. While it is generally not possible, at least in any 
reasonable way, to find a minimal set Ac, in typical settings we can give a simple description 
of a set Ac so that fl8.89p is still satisfied for an appropriate constant Ci. For example, 
whenever we can construct a sequence of spaces Wn for which dist(^, X„)x = 0{n~^), then 
the proof of Carl’s inequality (see e.g. [611) gives an explicit description of such an Ac- In 
particular, under the assumption (||'0j||x:) £ P’ and with the ijjj organized in non-increasing 
X norms, we can take Xn := span{-^i,... ,'^n}, for each n > 1, and the description of Ac 
amounts in defining a specific lattice discretization Uc of U such that Ac = a{Uc)- 

Let us now evaluate the cost of the fc-th step of the numerical weak-greedy algorithm. 
This step includes the computation of the reduced basis element Qk := Uh{a^) once has 

been chosen, using the possibly expensive finite element solver, which has cost of order Ch- 

On the other hand, we must also account for the maximization of the surrogate d{a, Vk-i)v 
over the set Ae- This cost is of order 

#(Ac)sk, (8.90) 
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where Sk is the cost of computing d{a, Vk-i)v for one value of a. 

We now give a derivation of a possible surrogate and evaluate the cost for this particular 
surrogate. Since for the reduced basis solution Uk{a) G 14, we have 

\j^\\uh{a) - Uk{a)\\v < dist{uh{a), 14)v < \\uh{a) - Uk{a)\\v (8.91) 

this surrogate should be an equivalent quantity to \\uh{a) — Uk{a)\\v- We introduce the 
Nh X Nh stiffness matrix 

j 

^h{y) = 'Bh + (8.92) 

for the sesquilinear form in the finite element basis, where B/j and Bhj are the 

corresponding stiffness matrices for B and Bj. Therefore, the coordinate vector Uh{y) of 
Uh{y) = Uh{ci{y)) in the finite element basis is the solution of the W x system 

Bh{y)Uh{y) = Fh, (8.93) 

where the right side does not depend on y. Here J = J{e) is the truncation level, 
already introduced for the evaluation of the online cost and Uh is defined as the discrete 
solution of the trunctated problem. We also introduce the coordinate vectors Gi of the 
reduced basis elements Uh{<F) in the finite element basis. Therefore, a reduced basis solution 
Uk{y) = Uk{a{y)) is represented in the finite element basis by the vector 


k-l 

Uk{y) = Y,ai{y)G,. (8.94) 

i=0 

We introduce an hilbertian norm || • ||* defined in such way that 

\\Wh\\* := \\wh\\v- (8.95) 

whenever H4 is the coordinate vector of Wh G 14. Note that this would coincide with the 
euclidean norm if the finite basis were orthonormal in V. This is generally not the case, but 
nevertheless the computation of this norm is usually of complexity W- We may thus write 

\\uh{y) - Uk{y)\\v = \\Uh{y) - Uk{y)\U. (8.96) 


By fl8.9ip . it follows that 

^\\Uk{y) - Uk{y)\\l < distiukiy),VkYy < \my) - Uk{y)\\l (8.97) 

Our next observation is that since for any y G f4l, 

Ql R 

-{B^Wh, Wh) < {Bh{y)Wh, Wh) < -{BhWh, Wh), (8.98) 

K a 
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one has the norm equivalence 


^\\Wh\U < \\B^"Bh{y)WH\U < -\\Wh\U. 

K a 

Therefore, we can define a surrogate quantity by 

d{y,V,.,)v := \\B-,^B,{y){U,{y)-Uk{y))\U, 

and obtain the equivalence (I8.62p with constants 



This surrogate is computable since we have 




d{y,Vk-i)v = B^ Fh-B,^ Bh{y)Uk{y) 


k-l 


Bh^Fh-Bf^^(Bh + J 2 yj^h,j) ^ai{y)G, 

3=0 


i=0 


(8.99) 


( 8 . 100 ) 


( 8 . 101 ) 


Developing this square norm, we hnd that it the sum of the constant term ||B^^TX||^ and 
of a linear combination of the real numbers atiy), a*(?/)«*/(?/), yjatiy) and yjyjiai{y)aii{y) 
for i, i' = 0,..., /c — 1 and j,j' = 1,..., T. The coefficients of these linear combinations are 
given by the (•,•)* inner products (associated to the || • ||* norm) between pairs of vectors 
chosen from 

B"V;„ * = 0,...,A:-1, j = l,...,J. (8.102) 

The precomputation of these vectors and of their inner product has a cost of order 

kJCh + k'^J^Nh. (8.103) 

Then, the computation of the surrogate d{y, Vk-i)v for each y has cost of order for the 

linear combination to which we must add the cost of computing the Oii{y), which according 
to the discussion on the online cost is of order k^. Therefore 

Sk ~ + k^. (8.104) 

In summary, the total cost of step k of the algorithm, without including the precomputations, 

is 

Ck + HAe){k'^J^ + k^) (8.105) 

so that the total cost up to step n = n{e) is of order 

n{e)Ch(£) + i^{Ae)n{efj{e)^ + ij^{Ae)n{eY. (8.106) 

We need to add the cost of precomputing: 
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(i) the vectors Gi and ^'BhjGi, ior i = 0, ... ,n — 1 and j = 1, ..., J and their 

(•, •)* inner products. 

(ii) the matrices B^ and B^ j ior k = 0,... ,n — 1, which entries are given by the euclidean 
inner product between the vectors Gk and the vectors B/jGj and B/ijGj. 

This precomputing cost is of total order 

n{e)J{e)Gkie) + n{efj{efNkie). (8.107) 

In summary, the total offline cost is of order 

Coff ~ n{e)J{e)Gh{e) + n{efJ{efNh{e) + J{ef + ij-{Ae)n{ef. (8.108) 

Among these terms, the largest is typically the third one which in our algebraic rate regime 
is of order in view of (I8.89p . 

This offline cost is thus potentially extremely large. Note however that it is due to the 
fact that we are using a brutal discrete search over As for the maximization of the surrogate 
quantity, so that there is room for improvement by using more sophisticated optimization 
strategies. Note also that in the case of a parametric problem with moderate number d of 
parameters, the quantity J{e) can simply be replaced by d. 

In conclusion, we hnd that, compared to the polynomial methods discussed in §6 and §7, 
the reduced basis method suffers from a very high offline cost, especially in high parametric 
dimension. This can be compensated by the fact that this method captures the same rate 
of decay as achieved by the optimal n-width spaces, so that a prescribed accuracy e may be 
achieved with a number n = n{e) of reduced basis elements much smaller than the number 
of terms in polynomial expansions for the same accuracy, making the online cost potentially 
lower. 
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